Generalization of Runoff Risk Prediction at Field Scales to a Continental-Scale Region Using Cluster Analysis and Hybrid Modeling

Abstract
As surface water resources in the U.S. continue to be pressured by excess nutrients carried by agricultural runoff, the need to assess runoff risk at the field scale continues to grow in importance. Most landscape hydrologic models developed at regional scales have limited applicability at finer spatial scales. Hybrid models can be used to address the scale mismatch between model simulation and applicability, but they may be limited in their ability to generalize over a large domain with heterogeneous hydrologic characteristics. To assist the generalization, we develop a regionalization approach based on principal component analysis and K-means clustering to identify clusters with similar runoff potential across the Great Lakes region. For each cluster, hybrid models are developed by combining the National Oceanic and Atmospheric Administration's National Water Model with a data-driven model, eXtreme Gradient Boosting (XGBoost), using field-scale measurements, enabling prediction of daily runoff risk level at the field scale over the entire region.

Key Points:
Identify five clusters in the Great Lakes region with similar runoff potential
Generalize hybrid models developed at field scales to a continental-scale region
Predict daily runoff risk on a 1 km-by-1 km grid over the entire Great Lakes region

Plain Language Summary:
Nutrient loading is an important factor determining water quality in the Great Lakes. Transport of nutrients to surface water is often correlated with runoff and has detrimental effects on aquatic ecosystems, such as harmful algal blooms. Runoff risk forecasts that form an early warning system can be used to improve the timing of nutrient application, with the dual benefits of reducing nutrient transport to surface water and leaving more nutrients in the field for crop growth. However, edge-of-field runoff is measured at the field scale, and such measurements are sparse across the Great Lakes region, which poses a great challenge to developing such a warning system at the continental scale. To address this challenge, we developed a generalization approach that allows predictive models developed from field-scale runoff measurements to be generalized to large regions with similar hydrogeologic characteristics. We can then predict the daily runoff risk level over the entire Great Lakes domain at 1 km-by-1 km resolution, which shows promise as the backbone of an early warning system that forecasts daily runoff risk levels for the contiguous U.S.
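The regionalization step described in the abstract (principal component analysis followed by K-means clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: the attribute matrix is synthetic, the choice of two principal components is an assumption made for the example, and the helper names `standardize`, `pca_scores`, and `kmeans` are hypothetical. Only the number of clusters (five) is taken from the Key Points. The sketch uses NumPy alone so that it stays self-contained.

```python
import numpy as np

def standardize(X):
    """Scale each column to zero mean and unit variance."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma[sigma == 0] = 1.0  # leave constant columns unscaled
    return (X - mu) / sigma

def pca_scores(X, n_components):
    """Project standardized data onto its leading principal components (via SVD)."""
    Z = standardize(X)
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = (S ** 2) / np.sum(S ** 2)  # fraction of variance per component
    return Z @ Vt[:n_components].T, explained[:n_components]

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(points, centroids)
        # Recompute each centroid; keep the old one if a cluster is empty.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return assign(points, centroids), centroids

# Synthetic stand-in for per-grid-cell attributes related to runoff potential
# (assumption: 200 cells, 6 attributes, drawn around 5 hypothetical centers).
rng = np.random.default_rng(42)
centers = rng.normal(scale=5.0, size=(5, 6))
attributes = np.vstack([c + rng.normal(scale=0.5, size=(40, 6)) for c in centers])

scores, explained = pca_scores(attributes, n_components=2)  # assumed: 2 components
labels, centroids = kmeans(scores, k=5)                     # five clusters, per Key Points

print("variance explained by retained components:", explained)
print("cells per cluster:", np.bincount(labels, minlength=5))
```

In the workflow the abstract describes, a separate hybrid model would then be developed for each resulting cluster. Because K-means depends on its random initialization, a real application would typically compare several seeds and cluster counts before fixing the regions.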
Description
Copyright 2022 American Geophysical Union. This article was originally published in Geophysical Research Letters. The version of record is available at: https://doi.org/10.1029/2022GL100667. This article will be embargoed until 02/26/2023.
Keywords
runoff potential, clustering, XGBoost, National Water Model, hybrid modeling, generalization
Citation
Ford, C. M., Hu, Y., Ghosh, C., Fry, L. M., Malakpour-Estalaki, S., Mason, L., et al. (2022). Generalization of runoff risk prediction at field scales to a continental-scale region using cluster analysis and hybrid modeling. Geophysical Research Letters, 49, e2022GL100667. https://doi.org/10.1029/2022GL100667