Groundwater fluoride prediction modeling using physicochemical parameters in Punjab, India: a machine-learning approach

Impact factor: 2.1; JCR quartile: Q3 (Soil Science)
Anjali Kerketta, Harmanpreet Singh Kapoor, Prafulla Kumar Sahoo
Journal: Frontiers in Soil Science. Published: 2024-07-18. DOI: 10.3389/fsoil.2024.1407502 (https://doi.org/10.3389/fsoil.2024.1407502)

Abstract

Rising fluoride levels in groundwater have become a worldwide concern, challenging the safe use of water resources and posing potential risks to human well-being. Elevated fluoride and its wide spatial variability have been documented across different districts of Punjab, India, so predicting fluoride levels is imperative for efficient groundwater resources planning and management. In this study, five models, Support Vector Machine (SVM), Random Forest (RF), Extreme Gradient Boosting (XGBoost), Extreme Learning Machine (ELM), and Multilayer Perceptron (MLP), are proposed to predict groundwater fluoride using physicochemical parameters and sampling depth as predictor variables. The performance of these five models was evaluated using the coefficient of determination (R²), mean absolute error (MAE), and root mean square error (RMSE). ELM outperformed the other four models, exhibiting strong predictive power. The R², MAE, and RMSE values for ELM were 0.85, 0.46, and 0.36 at the training stage and 0.95, 0.31, and 0.33 at the testing stage, respectively, while the other models yielded inferior results. Based on the relative importance scores, total dissolved solids (TDS), electrical conductivity (EC), sodium (Na⁺), chloride (Cl⁻), and calcium (Ca²⁺) contributed significantly to model performance. High variability in the target (fluoride) and predictor variables may have led to the poor performance of the models, implying a need for better data pre-processing techniques to improve data quality. ELM nonetheless showed satisfactory results and can be considered a promising model for predicting groundwater quality.
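The three evaluation metrics named in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration using their standard textbook definitions, not the authors' code; the function names and the `observed` and `predicted` values are hypothetical and are not taken from the study.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mae(y_true, y_pred):
    """Mean absolute error: average of |observed - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Hypothetical fluoride concentrations (mg/L), invented for illustration only.
observed = [0.5, 1.2, 2.0, 3.1, 0.8]
predicted = [0.7, 1.0, 2.3, 2.8, 0.9]

print(f"R2   = {r2_score(observed, predicted):.3f}")
print(f"MAE  = {mae(observed, predicted):.3f}")
print(f"RMSE = {rmse(observed, predicted):.3f}")
```

In practice these would be computed separately on the training and testing splits, which is how the abstract reports them for ELM.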