Improving long-term water quality forecasting with limited data using hidden pattern extraction and explainable ensemble learning

Impact Factor 6.3 · Zone 2 (Engineering Technology) · JCR Q1 (Engineering, Chemical)
Mehdi Mohammadi Ghaleni, Mansour Moradi, Mahnoosh Moghaddasi, Mojtaba Poursaeid, Mahmood Sadat-Noori
Journal of Water Process Engineering, vol. 75, Article 107946. Published 2025-05-19.
DOI: 10.1016/j.jwpe.2025.107946
https://www.sciencedirect.com/science/article/pii/S2214714425010189
Citations: 0

Abstract

This study focuses on enhancing long-term, multi-step forecasting of dissolved oxygen (DO), a key indicator of river water quality. We introduce a novel hybrid method, Hidden Pattern Feature Extraction–Statistical Mode Decomposition (HPFE–SMD), integrated with explainable ensemble learning models, namely Random Forest (RF) and Extra Trees Regressor (ETR), both in standalone and hybrid configurations (HPFE-RF and HPFE-ETR). The models were trained and evaluated using monthly DO data spanning 1974–2023 from two sites within the Mississippi River Basin, across forecasting horizons of 1, 3, 9, and 15 months. The hybrid models consistently outperformed their standalone counterparts. For instance, at a 15-month horizon for Site 1, the HPFE-ETR model reduced the Mean Absolute Error (MAE) by 98.1% compared to standalone ETR. In comparison with TVF-EMD-based models, HPFE-SMD achieved a 10.8% and 4.3% reduction in Mean Absolute Percentage Error (MAPE) for RF and ETR, respectively, at the 9-month horizon. Overall, HPFE-RF and HPFE-ETR achieved high predictive performance with RMSE values below 0.25 mg/L and R² values exceeding 0.99, even for long-term forecasts. SHAP (SHapley Additive exPlanations) analysis revealed that key statistical features, such as vibration amplitude (RMS), energy, skewness, kurtosis, and crest factor, played a dominant role in model predictions. Additionally, the proposed method demonstrated strong generalizability by accurately forecasting other water quality parameters, including total nitrogen, pH, total dissolved solids, and sodium adsorption ratio. These results highlight the added value of the HPFE-SMD approach over traditional decomposition or standalone ML models, showcasing its potential for integration into advanced water quality monitoring and management systems.
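The abstract names five statistical features (RMS, energy, skewness, kurtosis, crest factor) and four error metrics (MAE, MAPE, RMSE, R²) without defining them. The sketch below uses their standard textbook definitions to show what such quantities look like when computed over a window of a monthly series. It is not the authors' HPFE–SMD procedure, which the abstract does not specify; the function names `window_features` and `error_metrics`, the use of population moments, and the sample values are all assumptions made for illustration.

```python
import math


def window_features(x):
    """Standard statistical descriptors of one window of a series.

    Assumption: population (biased) moments and non-excess kurtosis;
    the paper may use different conventions.
    """
    n = len(x)
    mean = sum(x) / n
    energy = sum(v * v for v in x)               # sum of squared values
    rms = math.sqrt(energy / n)                  # root mean square amplitude
    m2 = sum((v - mean) ** 2 for v in x) / n     # 2nd central moment
    m3 = sum((v - mean) ** 3 for v in x) / n     # 3rd central moment
    m4 = sum((v - mean) ** 4 for v in x) / n     # 4th central moment
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    kurtosis = m4 / m2 ** 2 if m2 > 0 else 0.0
    crest = max(abs(v) for v in x) / rms if rms > 0 else 0.0  # peak / RMS
    return {
        "rms": rms,
        "energy": energy,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "crest_factor": crest,
    }


def error_metrics(y_true, y_pred):
    """MAE, MAPE (%), RMSE and R^2 for paired observations and forecasts.

    MAPE requires every observed value to be nonzero, and R^2 requires
    the observations to vary.
    """
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "mape": mape, "rmse": rmse, "r2": r2}


if __name__ == "__main__":
    # Made-up monthly DO values in mg/L, for illustration only.
    window = [8.1, 7.6, 6.9, 6.2, 5.8, 6.4, 7.3, 8.0]
    print(window_features(window))
    # Made-up forecasts paired with the same window.
    forecast = [8.0, 7.7, 7.0, 6.0, 5.9, 6.5, 7.1, 8.2]
    print(error_metrics(window, forecast))
```

In a pipeline of the kind the abstract describes, feature vectors like these would be computed per window and passed to a tree ensemble such as Random Forest or Extra Trees; how the windows and decomposition modes are actually constructed is left to the full paper.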
Source journal
Journal of Water Process Engineering
Subject category: Biochemistry, Genetics and Molecular Biology (Biotechnology)
CiteScore: 10.70
Self-citation rate: 8.60%
Articles published: 846
Review time: 24 days
About the journal: The Journal of Water Process Engineering aims to publish refereed, high-quality research papers with significant novelty and impact in all areas of the engineering of water and wastewater processing. Papers on advanced and novel treatment processes and technologies are particularly welcome. The Journal considers papers in areas such as nanotechnology and biotechnology applications in water, novel oxidation and separation processes, membrane processes (except those for desalination), catalytic processes for the removal of water contaminants, sustainable processes, water reuse and recycling, water use and wastewater minimization, integrated/hybrid technology, process modeling of water treatment and novel treatment processes. Submissions on the subject of adsorbents, including standard measurements of adsorption kinetics and equilibrium, will only be considered if there is a genuine case for novelty and contribution, for example highly novel, sustainable adsorbents and their use: papers on activated carbon-type materials derived from natural matter, or surfactant-modified clays and related minerals, would not fulfil this criterion. The Journal particularly welcomes contributions involving environmentally, economically and socially sustainable technology for water treatment, including those which are energy-efficient, with minimal or no chemical consumption, and capable of water recycling and reuse that minimizes the direct disposal of wastewater to the aquatic environment. Papers that describe novel ideas for solving issues related to water quality and availability are also welcome, as are those that show the transfer of techniques from other disciplines. The Journal will consider papers dealing with processes for various water matrices including drinking water (except desalination), domestic, urban and industrial wastewaters, in addition to their residues. It is expected that the journal will be of particular relevance to chemical and process engineers working in the field. The Journal welcomes Full Text papers, Short Communications, State-of-the-Art Reviews, Letters to Editors and Case Studies.