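Integrating machine learning models for optimizing ecosystem health assessments through prediction of nitrate–N concentrations in the lower stretch of Ganga River, India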

IF 5.8 | Zone 3, Environmental Science and Ecology | ENVIRONMENTAL SCIENCES
Basanta Kumar Das, Sanatan Paul, Biswajit Mandal, Pranab Gogoi, Liton Paul, Ajoy Saha, Canciyal Johnson, Akankshya Das, Archisman Ray, Shreya Roy, Shubhadeep Das Gupta
Environmental Science and Pollution Research, 32(8), pp. 4670-4689. Published 2025-01-30. DOI: 10.1007/s11356-025-35999-z
https://link.springer.com/article/10.1007/s11356-025-35999-z
Citations: 0

Abstract

Integrating machine learning models for optimizing ecosystem health assessments through prediction of nitrate–N concentrations in the lower stretch of Ganga River, India

Nitrate, a highly reactive form of inorganic nitrogen, is commonly found in aquatic environments. Understanding the dynamics of nitrate–N concentration in rivers and its interactions with other water-quality parameters is crucial for effective freshwater ecosystem management. This study uses machine learning models to analyse water quality parameters and predict nitrate–N concentrations in the lower stretch of the Ganga River from observations covering six annual periods (2017 to 2022). The parameters include water temperature, pH, specific conductivity (Sp_Con), dissolved oxygen (DO), nitrate–N, total phosphate (TP), turbidity, biochemical oxygen demand (BOD), silicate, total dissolved solids (TDS), and rainfall. The study evaluated the predictive performance of five models, namely Multiple Polynomial Regression (MPR), Generalized Additive Models (GAMs), Decision Tree Regression, Random Forest (RF), and XGBoost (Extreme Gradient Boosting), using the RMSE, MAE, MAPE, NSE and R² metrics. XGBoost emerged as the top performer, with an RMSE of 0.024, MAE of 0.018, MAPE of 51.805, NSE of 0.855 and R² of 0.85, explaining 85% of the variance in nitrate–N concentrations. Random Forest also demonstrated strong predictive capability, with an RMSE of 0.028, MAE of 0.021, MAPE of 57.272, NSE of 0.804 and R² of 0.80. MPR effectively modelled non-linear relationships, explaining 75% of the variance, while Decision Tree Regression and GAMs were less effective, with R² values of 0.60 and 0.48, respectively. BOD, pH, rainfall, water temperature, and total phosphate were the best predictors of nitrate–N dynamics. Comparative analysis with previous studies confirmed the robustness of XGBoost and Random Forest in environmental data modelling. The findings highlight the importance of advanced machine learning models in accurately predicting water quality parameters and facilitating proactive management strategies.
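
The five evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas the authors used, so the definitions below are the common textbook ones and should be read as assumptions: in particular, R² is computed here as the squared Pearson correlation (which can differ slightly from NSE), and MAPE skips pairs whose observed value is zero. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(observed, predicted):
    """Return RMSE, MAE, MAPE (%), NSE and R^2 for paired values.

    Assumed definitions (not confirmed by the abstract):
    NSE = 1 - SS_res / SS_tot, R^2 = squared Pearson correlation.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]

    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n

    # MAPE is undefined for zero observations; such pairs are skipped here.
    ratios = [abs(e / o) for o, e in zip(observed, errors) if o != 0]
    mape = 100.0 * sum(ratios) / len(ratios) if ratios else float("nan")

    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    if ss_tot == 0:
        raise ValueError("observed values are constant; NSE is undefined")

    # Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean model.
    nse = 1.0 - ss_res / ss_tot

    # Squared Pearson correlation between observations and predictions.
    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted))
    r2 = cov * cov / (ss_tot * ss_pred) if ss_pred != 0 else float("nan")

    return {"rmse": rmse, "mae": mae, "mape": mape, "nse": nse, "r2": r2}


if __name__ == "__main__":
    # Made-up nitrate-N style values, for illustration only.
    observed = [0.10, 0.20, 0.30, 0.40]
    predicted = [0.12, 0.18, 0.33, 0.37]
    for name, value in regression_metrics(observed, predicted).items():
        print(f"{name}: {value:.4f}")
```

Because MAPE divides by each observed value, it can be large when observed concentrations are close to zero, even if RMSE and MAE are small in absolute terms.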

Source journal
CiteScore: 8.70
Self-citation rate: 17.20%
Articles published: 6549
Review time: 3.8 months
Journal description: Environmental Science and Pollution Research (ESPR) serves the international community in all areas of Environmental Science and related subjects with emphasis on chemical compounds. This includes:
- Terrestrial Biology and Ecology
- Aquatic Biology and Ecology
- Atmospheric Chemistry
- Environmental Microbiology/Biobased Energy Sources
- Phytoremediation and Ecosystem Restoration
- Environmental Analyses and Monitoring
- Assessment of Risks and Interactions of Pollutants in the Environment
- Conservation Biology and Sustainable Agriculture
- Impact of Chemicals/Pollutants on Human and Animal Health
It reports from a broad interdisciplinary outlook.