Phytoplankton group classification by integrating trait information and observed environmental thresholds
Hoang Vuong Dang, Kermode Stephanie, Peisheng Huang, Cayelan C. Carey, Matthew R. Hipsey
Ecological Informatics, volume 90, Article 103212. Published 2025-05-24. DOI: 10.1016/j.ecoinf.2025.103212
https://www.sciencedirect.com/science/article/pii/S1574954125002213
Journal impact factor: 7.3 (JCR Q1, Ecology)
Citation count: 0
Abstract
Assigning phytoplankton taxa to functional groups is a common requirement for process-based models of aquatic ecology, yet it can be challenging in systems with large taxonomic diversity and remains a largely subjective task. In the absence of a clear and transferable framework, modellers often default to the delineation of phytoplankton groups at the phylum or class level (e.g., diatoms, greens). However, this approach aggregates the substantial functional and trait diversity that occurs within these groups, creating challenges for model parameterization and assessment. To address this issue, we developed a data-driven approach to define phytoplankton functional groups considering species trait information and occurrence data. The framework calculates the observed environmental thresholds for species monitored in a 12-year dataset from the Hawkesbury-Nepean River (Sydney, Australia), combined with a priori species-level trait information (e.g., organism structure, biovolume, movement types, and nutrient acquisition strategies). We minimized subjectivity in phytoplankton group classification by first applying multiple correlation analysis and principal component analysis to identify the most important environmental factors for threshold analysis. Second, we used Threshold Indicator Taxa Analysis (TITAN) to detect the ecological threshold ranges summarizing species occurrence along environmental gradients of total phosphorus, total nitrogen, the ratio of total nitrogen to total phosphorus, and temperature. Third, we applied K-prototype clustering for group classification based on the identified thresholds and associated traits. Our approach identified five discrete phytoplankton groups with statistically distinct features of environmental preference and morphological and physiological characteristics.
The advantage of the method is that the identified groups better reflect the ecological characteristics of the phytoplankton community considering the local environmental requirements, which better aligns with the process parameterizations used in numerical phytoplankton models. This framework can be applied in other aquatic systems as a robust and repeatable way to integrate long-term phytoplankton taxonomic and environmental datasets for water quality analyses.
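The clustering step described in the abstract can be illustrated with a minimal, pure-Python sketch of the general K-prototypes idea: numeric features (here, standing in for standardized environmental thresholds) are compared by squared Euclidean distance, categorical features (standing in for traits) by a mismatch count weighted by a factor `gamma`. This is not the authors' implementation. The function names `mixed_distance` and `k_prototypes`, the value of `gamma`, the choice of two clusters, and all data values and trait labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def mixed_distance(num_a, cat_a, num_b, cat_b, gamma):
    """Squared Euclidean distance on the numeric parts plus gamma times
    the number of mismatching categorical attributes."""
    numeric = sum((x - y) ** 2 for x, y in zip(num_a, num_b))
    mismatches = sum(1 for x, y in zip(cat_a, cat_b) if x != y)
    return numeric + gamma * mismatches


def k_prototypes(numeric, categorical, init_indices, gamma=1.0, max_iter=50):
    """Cluster records that have a numeric part and a categorical part.

    init_indices selects the records used as starting prototypes, which
    keeps this sketch deterministic (real tools usually initialize randomly).
    """
    protos = [(list(numeric[i]), list(categorical[i])) for i in init_indices]
    n_num, n_cat = len(numeric[0]), len(categorical[0])
    labels = None
    for _ in range(max_iter):
        # Assignment step: each record joins its nearest prototype.
        new_labels = [
            min(range(len(protos)),
                key=lambda k: mixed_distance(num, cat, protos[k][0],
                                             protos[k][1], gamma))
            for num, cat in zip(numeric, categorical)
        ]
        if new_labels == labels:
            break  # assignments are stable, so the prototypes are too
        labels = new_labels
        # Update step: numeric mean and categorical mode per cluster.
        for k in range(len(protos)):
            members = [i for i, lab in enumerate(labels) if lab == k]
            if not members:
                continue  # keep the old prototype for an empty cluster
            means = [sum(numeric[i][j] for i in members) / len(members)
                     for j in range(n_num)]
            modes = [Counter(categorical[i][j] for i in members)
                     .most_common(1)[0][0] for j in range(n_cat)]
            protos[k] = (means, modes)
    return labels, protos


# Hypothetical, made-up records for five taxa (not data from the study):
# (standardized TP threshold, standardized temperature threshold)
thresholds = [(-1.0, -0.9), (-0.8, -1.1), (1.0, 0.9), (1.2, 1.1), (0.9, 1.0)]
# (organism structure, movement type)
traits = [("unicellular", "non-motile"), ("unicellular", "non-motile"),
          ("colonial", "buoyant"), ("colonial", "buoyant"),
          ("filamentous", "buoyant")]

labels, prototypes = k_prototypes(thresholds, traits,
                                  init_indices=[0, 2], gamma=0.5)
print(labels)  # [0, 0, 1, 1, 1]
```

In this toy example the fifth taxon differs from the second prototype in one trait, but its thresholds are close enough that the weighted mismatch penalty does not move it to the other cluster. The weight `gamma` controls that trade-off between threshold similarity and trait similarity.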
About the journal:
The journal Ecological Informatics is devoted to the publication of high-quality, peer-reviewed articles on all aspects of computational ecology, data science and biogeography. The scope of the journal takes into account the data-intensive nature of ecology, the growing capacity of information technology to access, harness and leverage complex data, as well as the critical need for informing sustainable management in view of global environmental and climate change.
The nature of the journal is interdisciplinary at the crossover between ecology and informatics. It focuses on novel concepts and techniques for image- and genome-based monitoring and interpretation, sensor- and multimedia-based data acquisition, internet-based data archiving and sharing, data assimilation, modelling and prediction of ecological data.