Bewuket B Tefera, Jane Southworth, Joann Mossa, Mashoukur Rahaman, Mohammad Safaei, Di Yang, Shankar Karuppannan
Journal of Environmental Management, volume 394, article 127572. Published 2025-10-11.
DOI: 10.1016/j.jenvman.2025.127572 (https://doi.org/10.1016/j.jenvman.2025.127572)
Predictive groundwater quality responses to land cover and lithology in the upper Awash River basin (Ethiopia) with stacking ensembles.
Groundwater resources are vital for human and environmental needs, especially in humid and semi-arid regions. Conventional groundwater quality models, including statistical and single-algorithm machine learning techniques, often lack accuracy, interpretability, and scalability. This study presents an advanced ensemble machine learning framework for assessing groundwater quality in Ethiopia's Upper Awash River Basin, Africa. The Entropy Weighted Water Quality Index (EWQI) consolidates 13 hydrochemical parameters, including electrical conductivity, total dissolved solids, pH, and major ions. Data preprocessing involved imputation, standardization, and partitioning into training sets (70%) and testing sets (30%). Predictors include elevation, slope, land cover, lithology, and soil characteristics (type, moisture, and temperature). A novel stacking ensemble model was developed using Random Forest, Gradient Boosting, Support Vector Regression, K-Nearest Neighbors, and Extreme Gradient Boosting. The stacking model outperformed individual models, achieving training metrics of MSE 17.96, RMSE 4.24, and R² 0.97, as well as testing metrics of MSE 76.29, RMSE 8.73, and R² 0.87. The validation results showed an MSE of 67.18, an RMSE of 8.2, and an R² of 0.89. Beyond accuracy, SHAP interpretation shows that soil temperature, land cover, and soil moisture are the dominant drivers of EWQI, exceeding terrain and lithologic controls. By coupling an objective EWQI target with broadly available covariates and an interpretable stacked ensemble, the study links prediction to actionable land and water management in a data-scarce basin and outlines a transferable workflow.
About the journal:
The Journal of Environmental Management publishes peer-reviewed, original research on all aspects of the management and managed use of the environment, both natural and man-made. Critical review articles are also welcome; submission of these is strongly encouraged.