Predicting groundwater withdrawals using machine learning with limited metering data: Assessment of training data requirements
Dawit Asfaw, Ryan G. Smith, Sayantan Majumdar, Katherine Grote, Bin Fang, B.B. Wilson, V. Lakshmi, J.J. Butler Jr.
Agricultural Water Management, Volume 318, Article 109691. Published 2025-07-31.
DOI: 10.1016/j.agwat.2025.109691
URL: https://www.sciencedirect.com/science/article/pii/S0378377425004056
Abstract
The future of major aquifer systems supporting irrigated agriculture is threatened by unsustainable groundwater pumping. Metering of pumping is key to implementing robust groundwater management, but metering is limited in most aquifers. Although machine learning methods have been used to estimate pumping over certain regions, these studies have not fully demonstrated the data quantity and input parameters required to accurately estimate regional groundwater pumping. This study determined the data quantity required and identified the relevant features for developing Random Forests-based annual groundwater pumping estimates (2008–2020) over the Kansas High Plains aquifer. We predicted pumping at two spatial scales: point (well) and grid (2 km). We evaluated models trained on a range of training splits against a constant test set to understand their performance. Summing predicted pumping over a 2 km grid was made possible by knowledge of the crop irrigation area. This knowledge also decreased the uncertainty observed in linking individual wells with irrigated areas and further improved the spatial and temporal pumping estimates. At the 2 km scale, a model trained on 10% of the total available data had coefficient of determination (R²) values of 0.98 and 0.75 for training and testing, respectively. These results show that reasonable estimates of irrigation pumping are possible at the 2 km scale when 10% of irrigation wells are metered and the irrigated area is known. This finding has significant implications for groundwater management in many heavily stressed aquifers.
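The two evaluation scales described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes wells are assigned to 2 km cells by their projected coordinates (the study links wells to irrigated areas instead), and the function names, coordinates, and pumping values are all hypothetical. It sums per-well values into grid cells and computes the coefficient of determination at both the point and grid scales.

```python
from collections import defaultdict
from math import floor

CELL_M = 2000.0  # assumed 2 km grid cell size, in metres


def grid_cell(x_m, y_m, cell_m=CELL_M):
    """Return the (column, row) index of the grid cell containing a point."""
    return (floor(x_m / cell_m), floor(y_m / cell_m))


def sum_to_grid(wells, cell_m=CELL_M):
    """Sum per-well pumping into grid cells.

    wells: iterable of (x_m, y_m, pumping) tuples in projected coordinates.
    """
    totals = defaultdict(float)
    for x_m, y_m, pumping in wells:
        totals[grid_cell(x_m, y_m, cell_m)] += pumping
    return dict(totals)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical well locations (metres) with metered and predicted pumping.
locations = [(500.0, 500.0), (1500.0, 900.0), (2500.0, 300.0), (4100.0, 2100.0)]
metered = [100.0, 50.0, 80.0, 120.0]
predicted = [90.0, 60.0, 70.0, 130.0]

# Point (well) scale: compare values well by well.
r2_point = r_squared(metered, predicted)

# Grid scale: sum both series into 2 km cells, then compare cell by cell.
grid_obs = sum_to_grid((x, y, q) for (x, y), q in zip(locations, metered))
grid_pred = sum_to_grid((x, y, q) for (x, y), q in zip(locations, predicted))
cells = sorted(grid_obs)
r2_grid = r_squared([grid_obs[c] for c in cells], [grid_pred[c] for c in cells])

print(f"point R2 = {r2_point:.3f}, grid R2 = {r2_grid:.3f}")
```

In this toy data the first two wells fall in the same cell and their errors cancel when summed, so the grid-scale score is higher than the point-scale score. That is a property of the made-up numbers, not a result from the study.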
About the journal:
Agricultural Water Management publishes papers of international significance relating to the science, economics, and policy of agricultural water management. In all cases, manuscripts must address implications and provide insight regarding agricultural water management.