Kouao Laurent Kouadio, Jianxin Liu, Wenxiang Liu, Rong Liu
Computers & Geosciences, Volume 196, Article 105819. Published 2025-02-01. DOI: 10.1016/j.cageo.2024.105819
Impact factor: 4.2. JCR: Q1 (Computer Science, Interdisciplinary Applications).
https://www.sciencedirect.com/science/article/pii/S0098300424003029
A mixture learning strategy for predicting aquifer permeability coefficient K
The aquifer permeability coefficient (K) is critical for understanding, managing, and protecting groundwater resources. However, obtaining reliable K values directly from pumping tests is costly and time-consuming, and the tests often yield suboptimal results that lead to significant financial losses. Recent advances in machine learning offer an alternative, cost-effective approach for estimating K. The primary challenge is the substantial proportion of missing K data, because K can only be measured in aquifer layers. Such sparse and incomplete data severely limit the effectiveness of classical supervised learning methods. To address this challenge, we propose a mixture learning strategy (MXS) that combines unsupervised and supervised techniques to improve K prediction. First, K-Means clustering is applied to delineate a naïve group of aquifers (NGA), effectively generating proxy labels for layers where direct K measurements are unavailable. Next, these NGA labels are integrated with the existing K values to form enhanced input features for supervised prediction. Support vector machines (SVMs) and extreme gradient boosting (XGB) are then applied to predict K more accurately. Experimental results show that both SVMs and XGB achieve prediction accuracies exceeding 80% when evaluated using confusion matrices and micro- and macro-averaged precision-recall metrics. Testing the MXS approach on an independent borehole dataset confirms its robustness and effectiveness. By enabling accurate K predictions in the presence of significant data gaps, MXS supports more informed decision-making, reduces the likelihood of unsuccessful pumping tests, and aids the sustainable planning and management of groundwater resources.
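The proxy-labelling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the two-feature layer data, the K class names (`k1`, `k2`, `k3`), the `nga<cluster>` naming scheme, and the helper names `kmeans` and `merge_labels` are all hypothetical, and the clustering is a plain Lloyd's-algorithm K-Means with a deterministic initialisation chosen for reproducibility. The sketch clusters all layers, then keeps the measured K class where one exists and falls back to the cluster-derived proxy label elsewhere.

```python
import math

def kmeans(points, k, n_iter=50):
    """Plain Lloyd's algorithm. Returns (labels, centroids)."""
    # Deterministic init (assumption for reproducibility): take k points
    # spaced evenly through the input order. Real runs often use k-means++.
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: index of the nearest centroid (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels, centroids

def merge_labels(k_classes, clusters):
    """Keep the measured K class if present, else use a proxy NGA label."""
    return [
        kc if kc is not None else f"nga{c}"
        for kc, c in zip(k_classes, clusters)
    ]

# Hypothetical layer features (two arbitrary scaled log attributes per layer).
layers = [
    (1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2),
    (5.0, 5.1), (5.2, 4.9), (4.8, 5.0), (5.1, 5.2),
    (9.0, 1.0), (9.2, 1.1), (8.8, 0.9), (9.1, 1.2),
]
# Hypothetical K classes; None marks layers without a K measurement.
k_classes = ["k1", None, "k1", None, None, "k2",
             None, None, "k3", None, None, "k3"]

clusters, centroids = kmeans(layers, k=3)
merged = merge_labels(k_classes, clusters)

for features, label in zip(layers, merged):
    print(features, label)
```

After this step every layer carries a label, so the merged labels could be passed, together with the layer features, to a supervised classifier such as an SVM or a gradient-boosted model. How the NGA labels enter the supervised stage (as targets, as an extra input feature, or both) is a modelling choice that the abstract does not spell out.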
About the journal:
Computers & Geosciences publishes high-impact, original research at the interface between Computer Sciences and Geosciences. Publications should apply modern computer science paradigms, whether computational or informatics-based, to address problems in the geosciences.