A group of researchers, including several at Penn State University, has developed a computational model to streamline flood prediction in the continental USA. The team published their approach in Water Resources Research.
This research was supported by NOAA’s Cooperative Agreement with the Cooperative Institute for Research to Operations in Hydrology, as well as the US Department of Energy’s National Energy Research Scientific Computing Center and the California Department of Water Resources Atmospheric River Program.
AI-powered flood model
The high-resolution differentiable hydrologic and routing model incorporates big data, such as measurements from river networks, and physical principles, such as theories of river flow generation, into a system powered by artificial intelligence (AI) to simulate and predict water movement.
The team trained the model with a large dataset of streamflow information recorded at 2,800 gauge stations (sites that measure streamflow in rivers) provided by the US Geological Survey, along with weather data and detailed basin information. Using 15 years’ worth of streamflow data, they tasked their model with producing a 40-year high-resolution streamflow simulation for river systems across the continental USA. They then compared the simulation to the observed data, measuring the variance between observations and simulations. Relative to the current version of NOAA’s National Water Model (NWM), the researchers observed substantial improvements in streamflow prediction accuracy, roughly 30% overall, at approximately 4,000 gauge stations, which included the original 2,800 plus additional stations not included in the training data. Gains were especially large in regions with unique geological structures.
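Comparisons between simulated and observed streamflow like the one described above are commonly summarized in hydrology with the Nash-Sutcliffe efficiency (NSE), which measures how much of the variance in observations a simulation explains. The sketch below is illustrative only: the gauge values are invented, and the paper’s exact evaluation metric is not specified here.

```python
# Illustrative sketch: scoring a streamflow simulation against gauge
# observations with the Nash-Sutcliffe efficiency (NSE), a standard
# hydrologic skill metric. The data below are made up for demonstration.

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1.0 is a perfect match,
    0.0 means no better than always predicting the observed mean."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den

# Hypothetical daily streamflow (m^3/s) at one gauge station
obs = [12.0, 15.5, 30.2, 22.1, 18.4, 14.0]
sim = [11.5, 16.0, 28.9, 23.5, 17.8, 14.6]
print(round(nse(obs, sim), 3))  # a value near 1.0 indicates a close match
```

A model that improves streamflow prediction accuracy would show higher NSE values across gauge stations than the baseline it is compared against.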
“Our neural network approaches calibration by learning from the large datasets we have from past readings, while simultaneously considering the physics-based information from the NWM,” said Yalan Song, assistant research professor of civil and environmental engineering and a co-corresponding author on the paper. “This allows us to process large datasets very efficiently, without losing the level of detail a physics-based model provides, and at a higher level of consistency and reliability.”
According to Chaopeng Shen, professor of civil and environmental engineering at Penn State and co-corresponding author of the paper, their model is a candidate for the next-generation NWM framework that NOAA is developing to improve the standards of flood forecasting around the country. While not yet selected, Shen said their model is “highly competitive” as it is already coupled to this operational framework. However, it may still take time for model users to get comfortable with the AI component, according to Shen, who explained that careful independent evaluations are required to demonstrate that the model’s accuracy can be trusted even in untrained scenarios.
The team is working to close the final gap – improving the model’s prediction capability from daily to hourly – to make it more useful for operational applications, like hourly flood watches and warnings. Shen credited the research-to-operation work to civil engineering doctoral candidate Leo Lonzarich, noting that developing a framework other researchers can expand will be key to solving problems and evolving the model as a community.
“Once the model is trained, we can generate predictions at unprecedented speed,” Shen said. “In the past, generating 40 years of high-resolution data through the NWM could take weeks and required many different supercomputers working together. Now, we can do it on one system within hours, so this research could develop extremely rapidly and massively save costs.”
Comparison to legacy models
The paper highlights that traditional models like the NWM must undergo parameter calibration, in which large datasets consisting of decades of historical streamflow data from around the USA are processed to set parameters and produce useful simulations. Although the NWM is widely used by organizations like the National Weather Service to inform flood forecasting, according to Shen, the parameter calibration makes the process very inefficient.
“To be accurate with this model, traditionally your data needs to be individually calibrated on a site-by-site basis,” Shen said. “This process is time consuming, expensive and tedious. Our team determined that incorporating machine learning into the calibration process across all the sites could massively improve efficiency and cost effectiveness.”
The team’s model implements a subset of AI techniques known as neural networks, which efficiently recognize complex patterns across large, dynamic datasets. Neural networks work somewhat like a human brain, creating logical connections between their units, and can effectively operate autonomously and improve over time as they analyze more data.
According to Song, the team’s model implements several types of neural networks to recognize the patterns of key parameters and learn how they change in time and space.
“By incorporating neural networks, we avoid the site-specific calibration issue and improve the model’s efficiency substantially,” Song said. “Rather than approaching each site individually, the neural network applies general principles it interprets from past data to make predictions. This greatly increases efficiency, while still accurately predicting streamflow in areas of the country it may be unfamiliar with.”
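The regionalization idea Song describes can be sketched in a few lines: rather than fitting a parameter separately at each gauge site, one shared model learns a mapping from basin attributes to that parameter across all sites at once, and can then estimate it for a basin it has never seen. This toy example uses a single linear model and invented attributes in place of the team’s neural networks and real basin data.

```python
# Toy sketch of regionalized parameter learning: one shared model maps
# basin attributes to a physical-model parameter across ALL sites, replacing
# site-by-site calibration. A linear model stands in for the paper's neural
# networks; the attributes and the "true" relationship are invented.
import random

random.seed(0)

# Hypothetical sites: attributes (slope, soil_depth) paired with a
# calibrated parameter generated from a made-up rule:
# parameter = 0.5 * slope + 0.2 * soil_depth
attrs = [(random.random(), random.random()) for _ in range(200)]
data = [(a, 0.5 * a[0] + 0.2 * a[1]) for a in attrs]

w = [0.0, 0.0]  # weights of the single shared model
lr = 0.1
for _ in range(5000):
    a, target = random.choice(data)
    err = w[0] * a[0] + w[1] * a[1] - target
    # Stochastic gradient step on squared error, pooled over all sites
    w[0] -= lr * err * a[0]
    w[1] -= lr * err * a[1]

# The learned mapping transfers to a basin absent from training
new_basin = (0.8, 0.4)
estimate = w[0] * new_basin[0] + w[1] * new_basin[1]
print(round(estimate, 2))  # close to the "true" 0.5*0.8 + 0.2*0.4 = 0.48
```

Because the mapping is learned once over all sites, adding a new ungauged basin only requires its attributes, not a fresh calibration run, which is the efficiency gain the quote describes.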
According to Shen, purely data-driven machine learning (ML) models, which lack broad physical knowledge to support their predictions, can downplay the intensity of previously unseen outliers in simulations. Shen said this can be dangerous in the context of flood prediction and increasing weather extremes, since it would understate the actual risk. According to Song, the design of their model simultaneously offers the benefits of physics-based models and machine learning models, while improving the accuracy of extreme event predictions.
“The old approach is not only highly inefficient, but quite inconsistent,” Shen said. “With our new approach, we can create simulations using the same process, regardless of the region we are trying to simulate. As we process more data and create more predictions, our neural network will continue to improve. With a trained neural network, we can generate parameters for the entire US within minutes.”
“Because our model is physically interpretable, it can describe river basin features like soil moisture, the baseflow rate of rivers and groundwater recharge, which is very useful for agriculture and much harder for purely data-driven machine learning to produce,” Shen explained. “We can better understand natural systems that play critical roles in supporting ecosystems and the organisms within them all over the country.”
Alongside Shen and Song, the paper’s co-authors from Penn State include Tadd Bindas, who recently earned a doctorate in civil and environmental engineering; Kathryn Lawson, a research associate of deep learning in hydrology; and Penn State civil and environmental engineering doctoral candidates Haoyu Ji, Leo Lonzarich, Jiangtao Liu, Farshid Rahmani and Kamlesh Arun Sawadekar.
Additional co-authors include Wouter J M Knoben, Cyril Thébault and Martyn P Clark, University of Calgary; Katie van Werkhoven, Sam Lamont and Matthew Denno, Research Triangle Institute; Ming Pan and Yuan Yang, Scripps Institution of Oceanography, University of California, San Diego; Jeremy Rapp, Michigan State University; and Mukesh Kumar, Richard Adkins, James Halgren, Trupesh Patel and Arpita Patel, University of Alabama.
The Penn State team thanked the University of Alabama for its computing support.
In related news, a study published in Water Resources Research recently introduced a deep learning framework to predict the rise and fall of water levels during storms – even in places where tide gauges fail or data is scarce – through a technique known as ‘transfer learning’.