SALMA: A machine learning tool for precise leaf morphology measurements
Authors: Ilya Shabanov, Julie R Deslippe, Andrew Lensen
Journal: Ecological Informatics, volume 93, Article 103592 (Impact Factor 7.3; JCR Q1, Ecology)
DOI: 10.1016/j.ecoinf.2025.103592
URL: https://www.sciencedirect.com/science/article/pii/S1574954125006016
Publication date: 2026-02-01 (Epub 2026/1/2)
Publication type: Journal Article
Abstract
Leaf area is a critical plant functional trait, widely used for understanding plant responses to climate change, ecosystem productivity, and species' adaptive strategies. Inaccurate leaf area measurements compromise the accuracy of other traits normalised by area, such as foliar chemical traits, respiration, and photosynthesis. However, existing measurement methods are ineffective for small-leaved plants and often require manual processing, which limits sample sizes and potentially obscures subtle trait-environment relationships. We developed SALMA (Semi-Automated Leaf Morphological Analysis), which uses logistic regression trained on one to four human-generated examples per species to delineate leaf boundaries accurately for that species. SALMA's training step adapts to species-specific features by integrating multiple characteristics, such as colour variations and edge details. The approach is validated on an extensive dataset (64 species, 3332 images) that covers 91.4% of worldwide leaf area variation, as well as on two smaller datasets comprising low-quality photographs of morphologically complex or damaged leaves. SALMA consistently achieved leaf area errors 2 to 15 times lower than those of existing algorithms and of the theoretical upper bound for any grayscale intensity-based method. Critically, we identify a previously overlooked power-law relationship between leaf area and measurement error, demonstrating that existing methods may overestimate leaf area by at least 5% for 43% of global species, whereas SALMA produces errors of that size for only 2.1% of species. SALMA is standalone software with an intuitive interface that supports parallel processing, making it accessible for large-scale ecological studies globally. It can potentially enhance the quality of trait datasets and facilitate large-scale sampling, thereby advancing our understanding of plant-environment interactions. Our published dataset contains manually created binary segmentations of leaves and background, providing a baseline for future leaf measurement algorithms.
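The core idea in the abstract, a per-pixel logistic regression trained on a few human-labelled examples and then used to measure leaf area, can be illustrated with a minimal sketch. This is not SALMA's implementation: the abstract does not specify the features, optimiser, or image format, so the colour features (including an excess-green index), the gradient-descent training loop, the synthetic "scan", the 300 dpi resolution, and every function name below are assumptions made for illustration only.

```python
import math
import random


def pixel_features(rgb):
    """Colour features for one pixel (assumed feature set, not SALMA's)."""
    r, g, b = rgb
    return [r, g, b, 2 * g - r - b]  # last term: excess-green index


def train_logistic(X, y, lr=0.5, epochs=300):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    w = [0.0] * len(X[0])
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + bias
            p = 1.0 / (1.0 + math.exp(-z))  # predicted leaf probability
            err = p - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        bias -= lr * grad_b / n
    return w, bias


def synthetic_scan(size, leaf_rows, rng):
    """Toy image: a noisy green band (the 'leaf') on a near-white background."""
    image, mask = [], []
    for row in range(size):
        is_leaf = leaf_rows[0] <= row < leaf_rows[1]
        base = (0.2, 0.6, 0.2) if is_leaf else (0.9, 0.9, 0.9)
        image.append([
            tuple(min(1.0, max(0.0, c + rng.uniform(-0.05, 0.05))) for c in base)
            for _ in range(size)
        ])
        mask.append([1 if is_leaf else 0] * size)
    return image, mask


def segment(image, w, bias):
    """Label a pixel as leaf (1) when its predicted probability exceeds 0.5."""
    return [
        [1 if sum(wj * xj for wj, xj in zip(w, pixel_features(px))) + bias > 0 else 0
         for px in row]
        for row in image
    ]


def leaf_area_mm2(mask, dpi):
    """Convert a leaf-pixel count to area using the scan resolution."""
    mm_per_px = 25.4 / dpi
    return sum(map(sum, mask)) * mm_per_px ** 2


rng = random.Random(0)

# One human-labelled example: an image plus its binary leaf/background mask.
train_img, train_mask = synthetic_scan(20, (5, 12), rng)
X = [pixel_features(px) for row in train_img for px in row]
y = [m for row in train_mask for m in row]
w, bias = train_logistic(X, y)

# Apply the trained classifier to a new image and compare areas.
test_img, test_mask = synthetic_scan(20, (8, 18), rng)
pred = segment(test_img, w, bias)
true_area = leaf_area_mm2(test_mask, dpi=300)
pred_area = leaf_area_mm2(pred, dpi=300)
rel_error = abs(pred_area - true_area) / true_area
print(f"relative leaf area error: {rel_error:.3f}")
```

On real scans the classes overlap far more than in this toy image, which is presumably why the abstract mentions combining colour variation with edge details; an edge feature (for example, a local gradient magnitude) could be appended to `pixel_features` without changing the rest of the sketch.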
About the journal:
The journal Ecological Informatics is devoted to the publication of high-quality, peer-reviewed articles on all aspects of computational ecology, data science and biogeography. The scope of the journal takes into account the data-intensive nature of ecology, the growing capacity of information technology to access, harness and leverage complex data, and the critical need to inform sustainable management in view of global environmental and climate change.
The nature of the journal is interdisciplinary at the crossover between ecology and informatics. It focuses on novel concepts and techniques for image- and genome-based monitoring and interpretation, sensor- and multimedia-based data acquisition, internet-based data archiving and sharing, data assimilation, modelling and prediction of ecological data.