Spatiotemporal dynamics of cyanobacterial blooms: Integrating machine learning and feature selection techniques with uncrewed aircraft systems and autonomous surface vessel data

Impact factor 8.4; Zone 2 (Environmental Science and Ecology); JCR Q1 (Environmental Sciences)
Mohammed Shakiul Islam, Padmanava Dash, John P. Liles, Hafez Ahmad, Abduselam M. Nur, Rajendra M. Panda, Jessica S. Wolfe, Gray Turnage, Lee Hathcock, Gary D. Chesser, Robert J. Moorhead
Journal of Environmental Management, volume 381, Article 124878. Published 2025-04-09. DOI: 10.1016/j.jenvman.2025.124878
https://www.sciencedirect.com/science/article/pii/S0301479725008540

Abstract

Cyanobacterial blooms pose significant threats to aquatic ecosystems and public health due to their ability to release harmful toxins, degrade water quality, disrupt aquatic habitats, and endanger human and animal health through contact or consumption of contaminated water. Monitoring phycocyanin (PC), a pigment unique to cyanobacteria, offers a reliable method for detecting and quantifying these blooms, enabling timely interventions to mitigate their impacts. This study aimed to evaluate ten machine learning algorithms (MLAs) for assessing the spatiotemporal variations of cyanobacterial concentrations over an oyster reef in the Western Mississippi Sound (WMS) using remotely sensed imagery from uncrewed aircraft systems (UAS) and in-situ PC concentrations measured by an autonomous surface vessel (ASV). The study further investigated the influence of river discharge and climatic variables on cyanobacterial concentrations using a time-series of cyanobacteria maps. To derive the most accurate PC retrieval model, a comprehensive set of 85 features was initially generated, including individual spectral bands, band ratios, multiple vegetation indices, and three-band indices. Feature selection was performed using a two-step approach that combined Sequential Backward Floating Selection (SBFS) and Exhaustive Feature Selection (EFS). SBFS was first used to iteratively remove features and optimize model performance, while EFS evaluated all possible combinations of the features identified by SBFS to select the best subset. Among the ten MLAs tested, Extreme Gradient Boosting emerged as the top-performing model, achieving an R² of 0.835, a root mean square deviation of 0.419 μg/l, an unbiased mean absolute relative difference of 0.176 μg/l, and an average percentage difference of 18.072% in retrieving PC concentration.
The novelty of this study lies in its data-driven approach to identifying the most suitable machine learning algorithm and feature subsets for PC retrieval, thereby enhancing the accuracy and robustness of the developed algorithm. The time-series analysis revealed substantial variations in cyanobacterial concentration in the WMS from 2018 to 2022. The highest average concentration occurred in 2019, coinciding with the introduction of diverted Mississippi River water through the Bonnet Carré Spillway, which triggered an unprecedented cyanobacterial bloom. Furthermore, the average PC concentration was consistently higher during the summer months, likely due to elevated air temperatures and increased sunlight promoting cyanobacterial growth. The methodology developed in this study improves the quantitative monitoring of cyanobacterial blooms using UAS imagery and provides valuable insights for future water quality monitoring initiatives in other regions.
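
The two-step feature selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes synthetic data, substitutes an ordinary least-squares model for the machine learning algorithms evaluated in the study, and uses plain backward elimination without the floating (conditional re-inclusion) step of SBFS. The function names `cv_rmse`, `backward_eliminate`, and `exhaustive_select` are hypothetical.

```python
from itertools import combinations

import numpy as np


def cv_rmse(X, y, cols, k=5):
    """k-fold cross-validated RMSE of a least-squares fit on the given columns."""
    idx = np.arange(len(y))
    sq_err = []
    for test in np.array_split(idx, k):
        train = np.setdiff1d(idx, test)
        # Design matrices with an intercept column.
        A_train = np.column_stack([np.ones(len(train)), X[np.ix_(train, cols)]])
        A_test = np.column_stack([np.ones(len(test)), X[np.ix_(test, cols)]])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        sq_err.extend((A_test @ coef - y[test]) ** 2)
    return float(np.sqrt(np.mean(sq_err)))


def backward_eliminate(X, y, keep):
    """Step 1 (simplified): drop one feature at a time until `keep` remain.

    Assumption: no floating step, so a removed feature is never re-added.
    """
    cols = list(range(X.shape[1]))
    while len(cols) > keep:
        # Remove the feature whose absence yields the lowest CV error.
        drop = min(cols, key=lambda c: cv_rmse(X, y, [j for j in cols if j != c]))
        cols.remove(drop)
    return cols


def exhaustive_select(X, y, candidates, max_size):
    """Step 2: score every subset of the shortlist up to `max_size` features."""
    best_score, best_cols = None, None
    for r in range(1, max_size + 1):
        for subset in combinations(candidates, r):
            score = cv_rmse(X, y, list(subset))
            if best_score is None or score < best_score:
                best_score, best_cols = score, subset
    return best_cols, best_score


# Synthetic stand-in for spectral features: only columns 1 and 4 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 8))
y = 2.0 * X[:, 1] - 3.0 * X[:, 4] + 0.1 * rng.normal(size=120)

selected = backward_eliminate(X, y, keep=4)
best_subset, best_rmse = exhaustive_select(X, y, selected, max_size=3)
print("shortlist:", selected)
print("best subset:", best_subset)
```

The exhaustive step grows combinatorially with the number of candidates, which is presumably why it is applied only to the features that survive the first step rather than to the full set of 85.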