Improving Discharge Predictions in Ungauged Basins: Harnessing the Power of Disaggregated Data Modeling and Machine Learning

IF 5; Zone 1 (Earth Sciences); JCR Q2 (Environmental Sciences)

Water Resources Research. Pub Date: 2024-09-18. DOI: 10.1029/2024wr037122

Aggrey Muhebwa, Colin J. Gleason, Dongmei Feng, Jay Taneja

Citations: 0

Abstract

Current machine learning methods for discharge prediction often employ aggregated basin-wide hydrometeorological data (lumped modeling) for parametric and non-parametric training. This approach may overlook the spatial heterogeneity of river systems and their impact on discharge patterns. We hypothesize that integrating spatiotemporal hydrologic knowledge into the data modeling process (distributed/disaggregated modeling) can improve the performance of discharge prediction models. To test this hypothesis, we designed experiments comparing the performance of identical Long Short-Term Memory Recurrent Neural Network (LSTM-RNN) models forced with either lumped or distributed features. We gathered meteorological forcing and static attributes for the Mackenzie basin in Canada, a large and unique basin. Importantly, discharge performance was assessed out-of-sample with k-fold replication across gauges. Training LSTMs with disaggregated data significantly improved model accuracy. Specifically, there was a 9.6% increase in the mean Nash-Sutcliffe Efficiency and a 4.6% increase in the mean Kling-Gupta Efficiency, indicating better agreement between predicted and actual observations in terms of mean, variability, and correlation. These experiments and results demonstrate the importance of integrating topologically guided geomorphologic and hydrologic information (distributed modeling) in data-driven discharge predictions.
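For readers unfamiliar with the two skill scores named in the abstract, the following is a minimal sketch of how they are commonly computed. It assumes the standard Nash-Sutcliffe definition and the original three-component Kling-Gupta formulation (correlation, variability ratio, bias ratio); the abstract does not say which variant the authors used. The function names and the toy discharge values are illustrative only and do not come from the paper.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 minus squared error over observed variance."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def kge(obs, sim):
    """Kling-Gupta Efficiency (assumed three-component form)."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n
    std_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / n)
    std_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in sim) / n)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)) / n
    r = cov / (std_obs * std_sim)   # linear correlation
    alpha = std_sim / std_obs       # variability ratio
    beta = mean_sim / mean_obs      # bias (mean) ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Toy values for illustration only (not data from the study).
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.1, 1.9, 3.2, 3.8]
print(f"NSE = {nse(observed, simulated):.3f}")
print(f"KGE = {kge(observed, simulated):.3f}")
```

Both scores equal 1 for a perfect simulation. An NSE of 0 means the model is no better than predicting the observed mean, and the KGE decomposition is what ties the score to the mean, variability, and correlation terms mentioned in the abstract.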

Source journal

Water Resources Research (Environmental Sciences: Limnology)

CiteScore: 8.80

Self-citation rate: 13.00%

Articles published: 599

Review time: 3.5 months

About the journal: Water Resources Research (WRR) is an interdisciplinary journal that focuses on hydrology and water resources. It publishes original research in the natural and social sciences of water. It emphasizes the role of water in the Earth system, covering physical, chemical, biological, and ecological processes in water resources research and management, as well as social, policy, and public health implications. It encompasses observational, experimental, theoretical, analytical, numerical, and data-driven approaches that advance the science of water and its management. Submissions are evaluated for the novelty, accuracy, significance, and broader implications of their findings.