AlgAlert: A two-level approach for algae bloom prediction using deep learning
Authors: Areej Alsini, Amina Saeed, Dawood Amin
Journal: Ecological Informatics, volume 90, Article 103260 (impact factor 7.3; JCR Q1, Ecology)
Published: 2025-06-18
DOI: 10.1016/j.ecoinf.2025.103260
URL: https://www.sciencedirect.com/science/article/pii/S1574954125002699
Citation count: 0
Abstract
Chlorophyll-a (Chl-a) is a key indicator for detecting harmful algae blooms, which can damage aquatic ecosystems and cause economic losses. Consequently, governmental agencies and research institutions invest significant effort in monitoring water quality and developing management strategies for aquatic systems. With the increasing availability of real-time water quality, meteorological, and tidal sensor data, there is growing potential to harness this information through data-driven approaches such as machine learning to support aquatic systems management. This study presents a comprehensive data-driven framework named AlgAlert that integrates a regression model to forecast Chl-a concentrations and a classification model to predict the occurrence of blooms in a temperate estuarine system. The framework was developed by benchmarking multiple algorithms and selecting the best-performing regression and classification models for integration. Model evaluation was based on hourly water quality and meteorological data collected from early December 2019 to mid-January 2020 at the Kwilena monitoring site, the South Perth meteorological station, and a tidal gauge on Barrack Street, Perth, Australia. AlgAlert combines K-Nearest-Neighbours (KNN) regression to predict Chl-a levels with a custom classifier that determines bloom or no-bloom conditions from labelled time-series data. KNN gave the best regression performance, achieving a mean absolute error (MAE) of 0.25 and outperforming other models such as random forest (RF). Classification yielded nearly perfect F1-scores (0.99 for no-bloom and 0.98 for bloom), indicating that the model identified bloom events with few misses or false alarms. These results demonstrate AlgAlert's robust predictive capabilities and offer a reliable tool to support timely decision-making in water quality management.
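The two-level idea in the abstract (regress Chl-a first, then decide bloom or no-bloom) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not describe the custom classifier, so a fixed threshold stands in for it here, and the feature choice (water temperature, dissolved oxygen), the data values, the threshold of 10.0, and all function names are hypothetical. The sketch also shows how the two reported metrics, MAE and per-class F1, are computed.

```python
import math


def knn_predict(train_X, train_y, query, k=3):
    """Level 1: predict Chl-a as the mean target of the k nearest training rows."""
    # Euclidean distance from the query to every training row, nearest first.
    ranked = sorted(
        (math.dist(row, query), target) for row, target in zip(train_X, train_y)
    )
    nearest = ranked[:k]
    return sum(target for _, target in nearest) / len(nearest)


def classify_bloom(chl_a, threshold=10.0):
    """Level 2 stand-in: a hypothetical threshold rule, not the paper's classifier."""
    return "bloom" if chl_a > threshold else "no-bloom"


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive):
    """Harmonic mean of precision and recall for the class named by `positive`."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up readings: (water temperature in degC, dissolved oxygen in mg/L).
train_X = [(20.0, 8.0), (21.0, 7.5), (22.0, 7.0),
           (26.0, 5.0), (27.0, 4.5), (28.0, 4.0)]
train_y = [2.0, 3.0, 4.0, 14.0, 16.0, 18.0]  # made-up Chl-a values

test_X = [(21.0, 7.6), (27.0, 4.6)]
test_y = [3.0, 16.0]
test_labels = ["no-bloom", "bloom"]

# Level 1 feeds level 2: regress Chl-a, then label each prediction.
pred_y = [knn_predict(train_X, train_y, x, k=3) for x in test_X]
pred_labels = [classify_bloom(value) for value in pred_y]

print("MAE:", mean_absolute_error(test_y, pred_y))
print("F1 (bloom):", f1_score(test_labels, pred_labels, "bloom"))
print("F1 (no-bloom):", f1_score(test_labels, pred_labels, "no-bloom"))
```

KNN is distance-based, so in practice features measured on different scales would usually be standardised before the distance computation; that step is omitted here to keep the sketch short.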
About the journal:
The journal Ecological Informatics is devoted to the publication of high-quality, peer-reviewed articles on all aspects of computational ecology, data science, and biogeography. Its scope takes into account the data-intensive nature of ecology, the growing capacity of information technology to access, harness, and leverage complex data, and the critical need to inform sustainable management in view of global environmental and climate change.
The nature of the journal is interdisciplinary at the crossover between ecology and informatics. It focuses on novel concepts and techniques for image- and genome-based monitoring and interpretation, sensor- and multimedia-based data acquisition, internet-based data archiving and sharing, data assimilation, modelling and prediction of ecological data.