Water quality prediction method based on a combined machine learning model: A case study of the Daling River Basin

IF 4.4 · CAS Zone 3 (Environmental Science and Ecology) · JCR Q2 (ENVIRONMENTAL SCIENCES)
Yang Liu, Yingchun Wang
Journal of Contaminant Hydrology, volume 276, Article 104725. Published 2025-09-11.
DOI: 10.1016/j.jconhyd.2025.104725
https://www.sciencedirect.com/science/article/pii/S016977222500230X
Citations: 0

Abstract

Total nitrogen (TN) is a key factor limiting river water quality. Identifying its spatiotemporal variation and influencing factors is essential for predicting pollution trends and mitigating water quality risks. However, challenges remain, including insufficient real-time monitoring, limited spatiotemporal modeling, and complex feature selection for multi-source data. This study proposes a machine learning method that combines spatiotemporal weighted interpolation, relevant feature selection, and time-series decomposition to analyze TN concentration variations in the Daling River Basin (2023–2025) using data from seven monitoring stations. An Enhanced Long Short-Term Memory with Back Propagation Network (ELSTM-EBP) model is developed to assess the spatiotemporal fluctuations of TN and their driving factors. Results show that: (1) TN concentrations are highest in January and February and lowest in August, following a “U”-shaped fluctuation over the year; (2) significant spatial heterogeneity is observed across the seven monitoring points; (3) water temperature correlates negatively with TN, while dissolved oxygen correlates positively; other factors, such as the permanganate index and turbidity, also significantly influence TN levels; (4) the ELSTM-EBP model outperforms the other models in TN prediction accuracy and generalization ability (ELSTM-EBP > QLSTM > LSTM > GRU-QIMAS > EQINN > BP); (5) multi-step prediction accuracy decreases slightly with step length, but the prediction error remains within −0.4 to 0.4 mg/L for up to 7 steps, indicating robust performance for short-term predictions.
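The multi-step result in point (5) can be illustrated with a small sketch. It is not the authors' evaluation code: the function names `step_residuals` and `within_band`, the layout of the forecast table, and the TN values are all assumptions made for illustration, and a naive persistence forecast stands in for the ELSTM-EBP model. The sketch groups forecast residuals by prediction step and checks whether every residual at each step stays inside a ±0.4 mg/L band.

```python
from typing import Dict, List, Sequence


def step_residuals(
    observed: Sequence[float],
    forecasts: Sequence[Sequence[float]],
    max_step: int,
) -> Dict[int, List[float]]:
    """Group residuals (forecast minus observation) by prediction step.

    Assumed layout: forecasts[t][h - 1] is the forecast issued at time
    index t for time index t + h, with h = 1..max_step.
    """
    residuals: Dict[int, List[float]] = {h: [] for h in range(1, max_step + 1)}
    for t, row in enumerate(forecasts):
        for h in range(1, max_step + 1):
            target = t + h
            # Skip forecasts whose target time has no observation yet.
            if target < len(observed) and h <= len(row):
                residuals[h].append(row[h - 1] - observed[target])
    return residuals


def within_band(
    residuals: Dict[int, List[float]], band: float = 0.4
) -> Dict[int, bool]:
    """For each step, report whether all residuals lie within +/- band."""
    return {h: all(abs(r) <= band for r in rs) for h, rs in residuals.items()}


# Synthetic TN values in mg/L (illustrative only, not data from the study).
observed_tn = [3.0, 3.1, 2.9, 3.4, 3.2]

# Persistence baseline: every step repeats the last observed value.
# This is a stand-in for a trained model, used only to exercise the check.
persistence = [[value, value] for value in observed_tn]

residuals = step_residuals(observed_tn, persistence, max_step=2)
band_check = within_band(residuals, band=0.4)
```

With a trained model, `forecasts` would hold its outputs for steps 1 to 7, and `band_check` would show at which step, if any, the residuals first leave the band.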
Source journal
Journal of Contaminant Hydrology (Environmental Sciences; Geosciences, Multidisciplinary)
CiteScore: 6.80
Self-citation rate: 2.80%
Articles published: 129
Review time: 68 days
Journal description: The Journal of Contaminant Hydrology is an international journal publishing scientific articles pertaining to the contamination of subsurface water resources. Emphasis is placed on investigations of the physical, chemical, and biological processes influencing the behavior and fate of organic and inorganic contaminants in the unsaturated (vadose) and saturated (groundwater) zones, as well as at groundwater-surface water interfaces. The ecological impacts of contaminants transported both from and to aquifers are of interest. Articles on contamination of surface water only, without a link to groundwater, are out of scope. Broad latitude is allowed in identifying contaminants of interest, which include legacy and emerging pollutants, nutrients, nanoparticles, pathogenic microorganisms (e.g., bacteria, viruses, protozoa), microplastics, and various constituents associated with energy production (e.g., methane, carbon dioxide, hydrogen sulfide). The journal's scope embraces a wide range of topics including: experimental investigations of contaminant sorption, diffusion, transformation, volatilization and transport in the surface and subsurface; characterization of soil and aquifer properties only as they influence contaminant behavior; development and testing of mathematical models of contaminant behavior; innovative techniques for restoration of contaminated sites; development of new tools or techniques for monitoring the extent of soil and groundwater contamination; transformation of contaminants in the hyporheic zone; effects of contaminants traversing the hyporheic zone on surface water and groundwater ecosystems; subsurface carbon sequestration and/or turnover; and migration of fluids associated with energy production into groundwater.