George H. Myers, Kristen L. Underwood, Rebecca M. Diehl, Donna M. Rizzo, Tiffany L. Chin, Eric D. Roy
Title: From Rivers to Floodplains: Leveraging Transfer Learning to Predict Floodplain Dissolved Oxygen
Journal: Water Resources Research
DOI: 10.1029/2024wr039820 (https://doi.org/10.1029/2024wr039820)
Publication date: 2025-09-15
Abstract
Dissolved oxygen (DO) regulates the dominant biogeochemical processes in floodplains and is an important water quality indicator. However, predicting DO dynamics in floodplains with data-driven methods is challenging due to data scarcity, which limits our understanding of the efficacy of floodplain restoration for clean water objectives. This study applies domain adaptation transfer learning (TL) to a long short-term memory (LSTM) model to generate floodplain DO predictions. First, an LSTM model was trained on a data-rich river “source domain” and then used to predict floodplain DO. The trained river model was then used to initialize a new TL LSTM model, which was fine-tuned to the floodplain “target domain,” where the same type of monitoring data were scarcer. A third LSTM model was trained only on the floodplain data, and performance was compared across the three models. The TL model outperformed the river model and performed slightly better than the floodplain model (root mean squared error (RMSE): 2.79 for the TL model, 2.90 for the floodplain model, and 4.40 for the river model). Shapley additive explanation (SHAP) values revealed that while the floodplain model relied more heavily on site-specific attributes, the TL model encoded relationships with dynamic drivers, capturing process-informed behavior from both river and floodplain domains. Our findings suggest that TL produces models that generalize better across sites and are more robust to variable conditions, offering both predictive skill and process insight. Our modeling framework offers a scalable and interpretable solution for data-scarce environments, with broad applicability across water resources and Earth system sciences.
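The three-model comparison described in the abstract can be illustrated in miniature. The sketch below is not the authors' code and does not use an LSTM: a one-feature linear model fitted by gradient descent stands in for the network so that the example runs with the Python standard library alone. The synthetic "river" and "floodplain" data, their slopes and intercepts, the learning rate, and the epoch counts are all assumptions chosen for illustration. What carries over from the abstract is the workflow: train on the data-rich source domain, reuse those parameters to initialize a model that is fine-tuned on the scarce target domain, train a third model on the target data only, and compare RMSE on held-out target data.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def predict(params, xs):
    """Linear stand-in for the LSTM (assumption): y = w * x + b."""
    w, b = params
    return [w * x + b for x in xs]


def fit(xs, ys, params=(0.0, 0.0), lr=0.05, epochs=50):
    """Full-batch gradient descent on mean squared error, starting from `params`."""
    w, b = params
    n = len(xs)
    for _ in range(epochs):
        errs = [w * x + b - y for x, y in zip(xs, ys)]
        w -= lr * 2.0 * sum(e * x for e, x in zip(errs, xs)) / n
        b -= lr * 2.0 * sum(errs) / n
    return w, b


def make_domain(rng, n, slope, intercept, noise):
    """Hypothetical synthetic data: one driver x in [0, 1] and a noisy response."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [slope * x + intercept + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


rng = random.Random(0)
# Source domain is data-rich; target domain shares the slope but has a shifted
# intercept and only a handful of training samples (all values are made up).
river_x, river_y = make_domain(rng, 500, slope=4.0, intercept=6.0, noise=0.5)
fp_train_x, fp_train_y = make_domain(rng, 15, slope=4.0, intercept=3.0, noise=0.5)
fp_test_x, fp_test_y = make_domain(rng, 200, slope=4.0, intercept=3.0, noise=0.5)

# 1) River model: trained on the source domain only.
river_model = fit(river_x, river_y, epochs=2000)
# 2) TL model: initialized from the river model, fine-tuned on floodplain data.
tl_model = fit(fp_train_x, fp_train_y, params=river_model, epochs=50)
# 3) Floodplain model: trained from scratch on floodplain data only.
floodplain_model = fit(fp_train_x, fp_train_y, epochs=50)

models = {"river": river_model, "TL": tl_model, "floodplain": floodplain_model}
scores = {name: rmse(fp_test_y, predict(m, fp_test_x)) for name, m in models.items()}
for name, score in scores.items():
    print(f"{name} model RMSE on held-out floodplain data: {score:.2f}")
```

In this toy setting the only difference between the TL and floodplain-only models is the starting point passed to `fit`, which mirrors the initialization step described above. The printed numbers depend entirely on the assumed synthetic data and should not be compared with the RMSE values reported in the abstract.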
About the journal:
Water Resources Research (WRR) is an interdisciplinary journal that focuses on hydrology and water resources. It publishes original research in the natural and social sciences of water. It emphasizes the role of water in the Earth system, covering physical, chemical, biological, and ecological processes in water resources research and management, along with their social, policy, and public health implications. It encompasses observational, experimental, theoretical, analytical, numerical, and data-driven approaches that advance the science of water and its management. Submissions are evaluated for the novelty, accuracy, significance, and broader implications of their findings.