Comparative analysis of machine learning models for predicting water quality index in Dhaka’s rivers of Bangladesh

Impact Factor: 6.0 · Zone 3 (Environmental Science and Ecology) · JCR Q1 (Environmental Sciences)
Mosaraf Hosan Nishat, Md. Habibur Rahman Bejoy Khan, Tahmeed Ahmed, Syed Nahin Hossain, Amimul Ahsan, M. M. El-Sergany, Md. Shafiquzzaman, Monzur Alam Imteaz, Mohammad T. Alresheedi
Journal: Environmental Sciences Europe, vol. 37 (1). Published: 2025-03-03. DOI: 10.1186/s12302-025-01078-w
Article: https://link.springer.com/article/10.1186/s12302-025-01078-w
PDF: https://link.springer.com/content/pdf/10.1186/s12302-025-01078-w.pdf

Abstract

Pollution in Dhaka's navigable waterways, including the Buriganga, Balu, Tongi Khal, and Turag rivers, is a significant concern due to rapid industrial and urban expansion. Industrial discharges, domestic sewage, and inadequate waste management are the primary sources of this pollution, degrading water quality and threatening aquatic ecosystems. This study aimed to predict the Water Quality Index (WQI) of these rivers using fourteen machine learning (ML) models: Decision Tree Regression, Linear Regression, Ridge Regression, Stochastic Gradient Descent (SGD) Regressor, Extreme Gradient Boosting (XGB) Regressor, Light Gradient Boosting Machine (LightGBM) Regressor, Elastic Net Regressor, Support Vector Regression (SVR), Random Forest Regression, Bayesian Ridge Regressor, Artificial Neural Network (ANN), AdaBoost Regressor, CatBoost Regressor, and Extra Trees Regressor. The objective was to evaluate and compare these models to identify the most effective predictive method for WQI, enabling efficient environmental monitoring and management of urban waterways. Among the evaluated models, ANN and Random Forest Regression performed best. The ANN model achieved a Root Mean Squared Error (RMSE) of 2.34, a Mean Absolute Error (MAE) of 1.24, a Nash–Sutcliffe Efficiency (NSE) of 0.97, and a Coefficient of Determination (R²) of 0.97. An adjusted R² of 0.965 further supported its ability to capture complex patterns in water quality data. These findings emphasize the value of AI modeling techniques, specifically ANN and Random Forest Regression, for improving the accuracy of WQI forecasts for these waterways. The study contributes to environmental science by integrating feature selection techniques with ML models to make water quality monitoring more efficient and cost-effective. Unlike previous studies, this research specifically addresses the challenges of urban waterways in Dhaka, Bangladesh, a region significantly impacted by industrial and urban pollution. To our knowledge, this is the first study to apply such a comprehensive range of ML models to predict the WQI of Dhaka’s four major rivers. By providing a reliable methodology for WQI estimation, this study supports informed decision-making and proactive measures to protect vital water resources.
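
The evaluation metrics named in the abstract (RMSE, MAE, NSE, and adjusted R²) can be illustrated with a short sketch. This is not the authors' code: it uses the standard textbook definitions, the function names are chosen here for illustration, and the observed and predicted WQI values are hypothetical. Note that when R² is defined as 1 − SS_res/SS_tot it is numerically identical to NSE, which may be why the abstract reports the same value (0.97) for both; the abstract does not state which definition of R² was used.

```python
import math


def rmse(observed, predicted):
    """Root Mean Squared Error: square root of the mean squared residual."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean Absolute Error: mean of the absolute residuals."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def nse(observed, predicted):
    """Nash-Sutcliffe Efficiency: 1 - SS_res / SS_tot.

    Equals 1 for a perfect fit and 0 when the model is no better than
    always predicting the mean of the observations.
    """
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def adjusted_r2(r2, n_samples, n_features):
    """Adjusted R^2: penalizes R^2 for the number of input features."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


if __name__ == "__main__":
    # Hypothetical WQI values, for illustration only (not data from the study).
    observed = [62.0, 48.5, 71.2, 55.0, 39.8, 66.4]
    predicted = [60.5, 50.0, 69.8, 57.1, 41.0, 65.0]

    score = nse(observed, predicted)
    print("RMSE:", rmse(observed, predicted))
    print("MAE: ", mae(observed, predicted))
    print("NSE: ", score)
    # n_features=2 is an arbitrary assumption for this toy example.
    print("Adjusted R^2:", adjusted_r2(score, n_samples=len(observed), n_features=2))
```

Lower RMSE and MAE indicate smaller prediction errors in WQI units, while NSE and adjusted R² approach 1 as the fit improves, so a comparison across the fourteen models would rank on all of these together rather than on any single number.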

Source journal
Environmental Sciences Europe (Environmental Science: Pollution)
CiteScore: 11.20
Self-citation rate: 1.70%
Articles published: 110
Review time: 13 weeks
Journal description: ESEU is an international journal, focusing primarily on Europe, with a broad scope covering all aspects of environmental sciences, with regulation as a main topic. ESEU will discuss the entanglement between environmental sciences and regulation because, in recent years, there have been misunderstandings and even disagreement between stakeholders in these two areas. ESEU will help to improve the comprehension of issues between environmental sciences and regulation. ESEU will be an outlet from the German-speaking (DACH) countries to Europe and an inlet from Europe to the DACH countries regarding environmental sciences and regulation. Moreover, ESEU will facilitate the exchange of ideas and interaction between Europe and the DACH countries regarding environmental regulatory issues. Although Europe is at the center of ESEU, the journal will not exclude the rest of the world, because regulatory issues pertaining to environmental sciences can be fully seen only from a global perspective.