Predicting groundwater withdrawals using machine learning with limited metering data: Assessment of training data requirements

IF 6.5; Zone 1 (Agricultural and Forestry Sciences); JCR Q1 (Agronomy)
Dawit Asfaw, Ryan G. Smith, Sayantan Majumdar, Katherine Grote, Bin Fang, B.B. Wilson, V. Lakshmi, J.J. Butler Jr.
Agricultural Water Management, volume 318, Article 109691. Journal article, published 2025-07-31.
DOI: 10.1016/j.agwat.2025.109691
https://www.sciencedirect.com/science/article/pii/S0378377425004056
Citation count: 0

Abstract

The future of major aquifer systems supporting irrigated agriculture is threatened due to unsustainable groundwater pumping. Metering of pumping is key for implementing robust groundwater management, but metering is limited in most aquifers. Although machine learning methods have been used to estimate pumping over certain regions, these studies have not fully demonstrated the data quantity and input parameter requirements to accurately estimate regional groundwater pumping. This study determined the data quantity required and identified relevant features to develop Random Forests-based annual groundwater pumping estimates (2008–2020) over the Kansas High Plains aquifer. We predicted pumping at two spatial scales, i.e., point (well) and grid (2 km). We evaluated a combination of different training splits against a constant test set to understand the performance of the models. Summing predicted pumping over a 2 km grid was made possible with knowledge of crop irrigation area. This knowledge also decreased the uncertainty observed in linking individual wells with irrigated areas and further improved the spatial and temporal pumping estimates. At the 2 km scale, we observed that a model trained on 10 % of the total available data had coefficient of determination (R²) values of 0.98 and 0.75 for training and testing, respectively. These results show reasonable estimates of irrigation pumping are possible at the 2 km scale when 10 % of irrigation wells are metered and if the irrigated area is known. This finding has significant implications for groundwater management in many heavily stressed aquifers.
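
The two evaluation scales mentioned in the abstract can be illustrated with a small sketch. It is not the authors' code: the well records, the field names (x_m, y_m, metered, predicted), and the assignment of each well to a cell by its own coordinates are assumptions made for illustration, whereas the study links wells to irrigated areas. The sketch sums per-well values over 2 km cells and computes the coefficient of determination at both the point and the grid scale.

```python
from collections import defaultdict
from math import floor

CELL_SIZE_M = 2000.0  # 2 km grid cell, matching the grid scale in the abstract


def grid_cell(x_m, y_m, cell_size=CELL_SIZE_M):
    """Return the (column, row) index of the grid cell containing a point."""
    return (floor(x_m / cell_size), floor(y_m / cell_size))


def sum_by_cell(wells, value_key):
    """Sum one per-well value (e.g. predicted pumping) over each grid cell."""
    totals = defaultdict(float)
    for well in wells:
        totals[grid_cell(well["x_m"], well["y_m"])] += well[value_key]
    return dict(totals)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical wells: projected coordinates in metres, pumping in arbitrary
# volume units. These numbers are invented and do not come from the paper.
wells = [
    {"x_m": 500.0, "y_m": 700.0, "metered": 120.0, "predicted": 100.0},
    {"x_m": 1500.0, "y_m": 300.0, "metered": 80.0, "predicted": 95.0},
    {"x_m": 2500.0, "y_m": 900.0, "metered": 200.0, "predicted": 190.0},
    {"x_m": 3900.0, "y_m": 1900.0, "metered": 60.0, "predicted": 75.0},
    {"x_m": 100.0, "y_m": 2100.0, "metered": 150.0, "predicted": 140.0},
]

metered_grid = sum_by_cell(wells, "metered")
predicted_grid = sum_by_cell(wells, "predicted")
cells = sorted(metered_grid)  # same cell order for both series

# Point (well) scale: compare each well's metered and predicted value.
r2_point = r_squared(
    [w["metered"] for w in wells], [w["predicted"] for w in wells]
)
# Grid (2 km) scale: compare the per-cell sums.
r2_grid = r_squared(
    [metered_grid[c] for c in cells], [predicted_grid[c] for c in cells]
)

print(f"point-scale R2: {r2_point:.3f}")
print(f"grid-scale  R2: {r2_grid:.3f}")
```

In this invented data, errors of opposite sign within a cell partly cancel when summed, which is one plausible reason aggregated estimates can score higher than point estimates. The sketch says nothing about the magnitudes reported in the study.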
Source journal
Agricultural Water Management (Agricultural and Forestry Sciences: Agronomy)
CiteScore: 12.10
Self-citation rate: 14.90%
Articles published: 648
Review time: 4.9 months
Journal description: Agricultural Water Management publishes papers of international significance relating to the science, economics, and policy of agricultural water management. In all cases, manuscripts must address implications and provide insight regarding agricultural water management.