Groundwater fluoride prediction modeling using physicochemical parameters in Punjab, India: a machine-learning approach

IF 4.6 Q2 MATERIALS SCIENCE, BIOMATERIALS

ACS Applied Bio Materials Pub Date : 2024-07-18 DOI:10.3389/fsoil.2024.1407502

Anjali Kerketta, Harmanpreet Singh Kapoor, Prafulla Kumar Sahoo



Abstract

Rising fluoride levels in groundwater have become a worldwide concern, challenging the safe use of water resources and posing potential risks to human well-being. Elevated fluoride and its wide spatial variability have been documented across different districts of Punjab, India, so predicting fluoride levels is essential for efficient groundwater planning and management. In this study, five models, Support Vector Machine (SVM), Random Forest (RF), Extreme Gradient Boosting (XGBoost), Extreme Learning Machine (ELM), and Multilayer Perceptron (MLP), are proposed to predict groundwater fluoride using physicochemical parameters and sampling depth as predictor variables. The performance of the five models was evaluated using the coefficient of determination (R²), mean absolute error (MAE), and root mean square error (RMSE). ELM outperformed the other four models, exhibiting strong predictive power. The R², MAE, and RMSE values for ELM were 0.85, 0.46, and 0.36 at the training stage and 0.95, 0.31, and 0.33 at the testing stage, respectively, while the other models yielded inferior results. Based on the relative importance scores, total dissolved solids (TDS), electrical conductivity (EC), sodium (Na⁺), chloride (Cl⁻), and calcium (Ca²⁺) contributed significantly to model performance. High variability in the target (fluoride) and predictor variables might have led to the poor performance of the models, implying the need for better data pre-processing techniques to improve data quality. Even so, ELM showed satisfactory results and can be considered a promising model for predicting groundwater quality.
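To make the evaluation setup concrete, the following is a minimal sketch of an ELM regressor scored with R², MAE, and RMSE on separate training and testing splits. It is not the authors' implementation: the abstract gives no architecture or hyperparameters, so the class name `ELMRegressor`, the hidden-layer size, the tanh activation, the small ridge term, and the 80/20 split are all assumptions, and the data are synthetic stand-ins for the physicochemical predictors.

```python
import numpy as np


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def mae(y_true, y_pred):
    """Mean absolute error."""
    return np.mean(np.abs(y_true - y_pred))


def rmse(y_true, y_pred):
    """Root mean square error."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))


class ELMRegressor:
    """Hypothetical single-hidden-layer ELM: the hidden weights are random
    and never trained; only the output weights are fitted, in closed form."""

    def __init__(self, n_hidden=50, ridge=1e-3, seed=0):
        self.n_hidden = n_hidden
        self.ridge = ridge  # assumed regularisation, not from the paper
        self.seed = seed

    def _hidden(self, X):
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        n_features = X.shape[1]
        # Random, fixed input weights and biases (scaled to avoid saturation).
        self.W = rng.normal(size=(n_features, self.n_hidden)) / np.sqrt(n_features)
        self.b = rng.normal(size=self.n_hidden)
        self.y_mean = y.mean()
        H = self._hidden(X)
        # Ridge-regularised least squares for the output weights.
        A = H.T @ H + self.ridge * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ (y - self.y_mean))
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta + self.y_mean


# Synthetic stand-in data: 6 made-up predictors and a made-up target.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 6))
y = (1.0 + 0.8 * X[:, 0] - 0.5 * X[:, 1]
     + 0.3 * np.tanh(X[:, 2]) + 0.1 * rng.normal(size=300))

# Assumed 80/20 split; standardise using training statistics only.
X_train, X_test = X[:240], X[240:]
y_train, y_test = y[:240], y[240:]
mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
X_train_s = (X_train - mu) / sigma
X_test_s = (X_test - mu) / sigma

model = ELMRegressor().fit(X_train_s, y_train)
scores = {}
for name, Xs, ys in [("train", X_train_s, y_train), ("test", X_test_s, y_test)]:
    pred = model.predict(Xs)
    scores[name] = {"R2": r2(ys, pred), "MAE": mae(ys, pred), "RMSE": rmse(ys, pred)}
    print(name, {k: round(float(v), 3) for k, v in scores[name].items()})
```

Reporting the three metrics for both splits, as the abstract does, makes it easy to compare how a model fits the training data with how it generalises to unseen samples. The other four models could be scored with the same three functions.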




Source journal

ACS Applied Bio Materials Chemistry-Chemistry (all)

CiteScore

9.40

Self-citation rate

2.10%

Articles published

464

Journal description: ACS Applied Bio Materials is an interdisciplinary journal publishing original research covering all aspects of biomaterials and biointerfaces including and beyond the traditional biosensing, biomedical and therapeutic applications. The journal is devoted to reports of new and original experimental and theoretical research of an applied nature that integrates knowledge in the areas of materials, engineering, physics, bioscience, and chemistry into important bio applications. The journal is specifically interested in work that addresses the relationship between structure and function and assesses the stability and degradation of materials under relevant environmental and biological conditions.