Caicai Xu, Yuzhou Huang, Ruoxue Xin, Na Wu, Muyuan Liu
Water Research, Volume 283, Article 123800. Published 2025-05-12. DOI: 10.1016/j.watres.2025.123800. Impact factor: 11.4 (JCR Q1, Engineering, Environmental). https://www.sciencedirect.com/science/article/pii/S0043135425007092
Algal bloom forecasting leveraging signal processing: A novel perspective from ensemble learning
Accurate forecasting of algal blooms is essential for implementing timely control measures. However, because algal dynamics have inherently complex time-frequency characteristics, capturing them remains an ongoing challenge for standalone models. To address this challenge, this study demonstrates an ensemble framework that combines signal processing with machine learning (ML) techniques to forecast algal dynamics. The method uses an efficient signal processing algorithm, complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), to decompose the highly non-stationary patterns of algal dynamics, while leveraging the complementary strengths of four distinct ML models to optimize the learning of the decomposed components. Our results demonstrated that CEEMDAN can substantially improve the forecasting performance of standalone ML models (e.g., long short-term memory), achieving an average increase in validation R² of 63%. Moreover, incorporating ensemble effects that leverage model-specific strengths further amplified this gain, resulting in an average increase of 75% in validation R² compared to standalone ML models. The developed method, termed the CEEMDAN-Hybrid-Ensemble (CHES) model, consistently delivered accurate forecasts of algal dynamics across multiple time resolutions (hourly, daily, and biweekly) in both rivers (River Enborne and The Cut) and lakes (Blelham Tarn and Lake Lillinonah), as suggested by high validation R² values of 0.955, 0.878, 0.824, and 0.957, respectively. In addition, the CHES model achieved stable multi-step forecasting of algal dynamics with gaps ranging from 1 to 7 steps, as indicated by an average validation R² of 0.72 ± 0.17 (S.D.) and an average validation root-mean-square error (RMSE) of 0.32 ± 0.11 RFU. This study highlighted the ensemble effect achieved by integrating signal processing and ML techniques, presenting a novel perspective that enhances forecasting robustness to support the early warning of algal blooms.
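The decompose-then-forecast idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: a moving-average split stands in for CEEMDAN, two toy forecasters (persistence and a one-lag linear fit) stand in for the four ML models, the per-component selection on an 80/20 holdout is a hypothetical rule, and the series is synthetic. The sketch also decomposes the full series up front, so the centered windows see future values; a real evaluation would need to avoid that leakage.

```python
import math

def moving_average(x, window):
    """Moving average with shrinking windows at the edges."""
    half = window // 2
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out

def decompose(x, windows=(5, 25)):
    """Split x into additive components (fast detail first, slow trend last).

    Stand-in for CEEMDAN: NOT the algorithm used in the paper, only an
    additive decomposition whose components sum back to x.
    """
    components, residual = [], list(x)
    for w in windows:
        smooth = moving_average(residual, w)
        components.append([r - s for r, s in zip(residual, smooth)])
        residual = smooth
    components.append(residual)
    return components

def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def fit_linear(inputs, targets):
    """Least-squares fit of target = a * input + b."""
    n = len(inputs)
    mx, my = sum(inputs) / n, sum(targets) / n
    sxx = sum((v - mx) ** 2 for v in inputs)
    sxy = sum((v - mx) * (t - my) for v, t in zip(inputs, targets))
    a = sxy / sxx if sxx > 0 else 0.0
    return a, my - a * mx

def forecast_component(comp, n_train, gap):
    """Forecast comp[t] from comp[t - gap] for every t >= n_train."""
    n_fit = int(0.8 * n_train)  # hypothetical 80/20 split used for model selection
    a, b = fit_linear(comp[:n_fit - gap], comp[gap:n_fit])
    models = {"persistence": lambda v: v, "linear": lambda v: a * v + b}
    hold_true = comp[n_fit:n_train]

    def holdout_error(name):
        preds = [models[name](comp[t - gap]) for t in range(n_fit, n_train)]
        return rmse(hold_true, preds)

    best = min(models, key=holdout_error)  # keep the model with lower holdout RMSE
    return best, [models[best](comp[t - gap]) for t in range(n_train, len(comp))]

def hybrid_forecast(series, n_train, gap=1):
    """Decompose, forecast each component separately, then sum the forecasts."""
    chosen, total = [], [0.0] * (len(series) - n_train)
    for comp in decompose(series):
        name, preds = forecast_component(comp, n_train, gap)
        chosen.append(name)
        total = [s + p for s, p in zip(total, preds)]
    return chosen, total

# Synthetic series (made-up data): slow cycle plus a faster oscillation.
series = [2.0 + math.sin(2 * math.pi * t / 50) + 0.3 * math.sin(2 * math.pi * t / 7)
          for t in range(300)]
n_train = 240
chosen, preds = hybrid_forecast(series, n_train, gap=1)
actual = series[n_train:]
print("models per component:", chosen)
print("validation R2:", round(r2(actual, preds), 3), "RMSE:", round(rmse(actual, preds), 3))
```

Raising `gap` from 1 to 7 mimics the multi-step setting mentioned above, with each forecast made from the value observed `gap` steps earlier.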
About the journal:
Water Research, along with its open access companion journal Water Research X, serves as a platform for publishing original research papers covering various aspects of the science and technology related to the anthropogenic water cycle, water quality, and its management worldwide. The audience targeted by the journal comprises biologists, chemical engineers, chemists, civil engineers, environmental engineers, limnologists, and microbiologists. The scope of the journal includes:
• Treatment processes for water and wastewaters (municipal, agricultural, industrial, and on-site treatment), including resource recovery and residuals management;
• Urban hydrology including sewer systems, stormwater management, and green infrastructure;
• Drinking water treatment and distribution;
• Potable and non-potable water reuse;
• Sanitation, public health, and risk assessment;
• Anaerobic digestion, solid and hazardous waste management, including source characterization and the effects and control of leachates and gaseous emissions;
• Contaminants (chemical, microbial, anthropogenic particles such as nanoparticles or microplastics) and related water quality sensing, monitoring, fate, and assessment;
• Anthropogenic impacts on inland, tidal, coastal and urban waters, focusing on surface and ground waters, and point and non-point sources of pollution;
• Environmental restoration, linked to surface water, groundwater and groundwater remediation;
• Analysis of the interfaces between sediments and water, and between water and atmosphere, focusing specifically on anthropogenic impacts;
• Mathematical modelling, systems analysis, machine learning, and beneficial use of big data related to the anthropogenic water cycle;
• Socio-economic, policy, and regulations studies.