Learning the syntax of plant assemblages.
Authors: César Leblanc, Pierre Bonnet, Maximilien Servajean, Wilfried Thuiller, Milan Chytrý, Svetlana Aćić, Olivier Argagnon, Idoia Biurrun, Gianmaria Bonari, Helge Bruelheide, Juan Antonio Campos, Andraž Čarni, Renata Ćušterevska, Michele De Sanctis, Jürgen Dengler, Tetiana Dziuba, Emmanuel Garbolino, Ute Jandt, Florian Jansen, Jonathan Lenoir, Jesper Erenskjold Moeslund, Aaron Pérez-Haase, Remigiusz Pielech, Jozef Sibik, Zvjezdana Stančić, Domas Uogintas, Thomas Wohlgemuth, Alexis Joly
Journal: Nature Plants (JCR Q1, Plant Sciences; impact factor 13.6)
Published: 2025-10-13
Publication type: Journal Article
DOI: https://doi.org/10.1038/s41477-025-02105-7
Citations: 0
Abstract
To address the urgent biodiversity crisis, it is crucial to understand the nature of plant assemblages. The distribution of plant species is shaped not only by their broad environmental requirements but also by micro-environmental conditions, dispersal limitations, and direct and indirect species interactions. While predicting species composition and habitat type is essential for conservation and restoration purposes, it remains challenging. In this study, we propose an approach inspired by advances in large language models to learn the 'syntax' of abundance-ordered plant species sequences in communities. Our method, which captures latent associations between species across diverse ecosystems, can be fine-tuned for diverse tasks. In particular, we show that our methodology is able to outperform other approaches to (1) predict species that might occur in an assemblage given the other listed species, despite being originally missing in the species list (16.53% higher accuracy in retrieving a plant species removed from an assemblage than co-occurrence matrices and 6.56% higher than neural networks), and (2) classify habitat types from species assemblages (5.54% higher accuracy in assigning a habitat type to an assemblage than expert system classifiers and 1.14% higher than tabular deep learning). The proposed application has a vocabulary that covers over 10,000 plant species from Europe and adjacent countries and provides a powerful methodology for improving biodiversity mapping, restoration and conservation biology. As ecologists begin to explore the use of artificial intelligence, such approaches open opportunities for rethinking how we model, monitor and understand nature.
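As a rough illustration of the first task, the sketch below builds abundance-ordered species sequences from toy plots and retrieves a removed species with a simple co-occurrence count, the kind of baseline the abstract compares against. It is not the authors' language-model method: the plot data, cover values, and the function names `to_sequence`, `cooccurrence_counts`, and `predict_missing` are all hypothetical, chosen only for this example.

```python
from collections import Counter
from itertools import combinations

# Toy plots (hypothetical data): species name -> percentage cover.
PLOTS = [
    {"Fagus sylvatica": 80, "Galium odoratum": 15, "Anemone nemorosa": 5},
    {"Fagus sylvatica": 70, "Anemone nemorosa": 20, "Galium odoratum": 10},
    {"Quercus robur": 60, "Anemone nemorosa": 30, "Fagus sylvatica": 10},
    {"Calluna vulgaris": 90, "Erica tetralix": 10},
]


def to_sequence(plot):
    """Order species by decreasing cover; ties are broken alphabetically."""
    ranked = sorted(plot.items(), key=lambda kv: (-kv[1], kv[0]))
    return [species for species, _ in ranked]


def cooccurrence_counts(sequences):
    """Count how often each unordered pair of species shares a plot."""
    counts = Counter()
    for seq in sequences:
        for a, b in combinations(sorted(set(seq)), 2):
            counts[(a, b)] += 1
    return counts


def predict_missing(partial, counts, vocabulary):
    """Return the absent species with the highest summed co-occurrence
    with the species listed in `partial`."""
    def score(candidate):
        return sum(counts[tuple(sorted((candidate, s)))] for s in partial)

    # Sorting the candidates makes tie-breaking deterministic.
    candidates = sorted(set(vocabulary) - set(partial))
    return max(candidates, key=score)


sequences = [to_sequence(p) for p in PLOTS]
vocabulary = {s for seq in sequences for s in seq}
counts = cooccurrence_counts(sequences)

# Remove one species from the first plot and try to recover it.
print(predict_missing(["Fagus sylvatica", "Galium odoratum"], counts, vocabulary))
```

Note that this baseline discards the abundance ordering once the counts are built, whereas the approach described in the abstract learns from the ordered sequence itself.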
About the journal:
Nature Plants is an online-only, monthly journal publishing the best research on plants — from their evolution, development, metabolism and environmental interactions to their societal significance.