Tomke S Wacker, Abraham G Smith, Signe M Jensen, Theresa Pflüger, Viktor G Hertz, Eva Rosenqvist, Fulai Liu, Dorte B Dresbøll
Title: Stomata morphology measurement with interactive machine learning: accuracy, speed, and biological relevance?
Journal: Plant Methods 21(1): 95. Published 2025-07-09.
DOI: https://doi.org/10.1186/s13007-025-01416-2
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12243431/pdf/
Abstract
Stomatal morphology plays a critical role in regulating plant gas exchange, influencing water use efficiency and ecological adaptability. While traditional methods for analyzing stomatal traits rely on labor-intensive manual measurements, machine learning (ML) tools offer a promising alternative. In this study, we evaluate the suitability of U-Net-based interactive ML software with corrective annotation for stomatal morphology phenotyping. The approach enables non-ML experts to efficiently segment stomatal structures across diverse datasets, including images from different plant species, magnifications, and imprint methods. We trained a single model on images from five datasets and tested its performance on unseen data, achieving high accuracy for stomatal density (R² = 0.98) and size (R² = 0.90). Thresholding approaches applied to the U-Net segmentations further improved accuracy, particularly for density measurements. Despite significant variability between datasets, our findings demonstrate the feasibility of training a single segmentation model to analyze diverse stomatal datasets. Validation showed that a semi-automatic approach involving correcting segmentations was five times faster than manual annotation while maintaining comparable accuracy. Our results also illustrate that ML metrics, such as the F1 score, correlate with accuracy in the statistical analysis of trait measurements, with improvements diminishing after 2:30 h of model training. The final model achieved high precision, allowing the detection of highly significant biological differences in stomatal morphology within plants, between genotypes, and across growing environments. This study highlights interactive ML with corrective annotation as a robust and accessible tool for accelerating phenotyping in plant sciences, reducing technical barriers and promoting high-throughput analysis.
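As an illustration of how trait values like those reported above could be derived from a segmentation, the following is a minimal sketch and not the authors' pipeline. It assumes a binary mask (1 = stomatal pixel), treats "thresholding" as discarding connected regions smaller than a hypothetical `min_area_px`, and computes R² as the coefficient of determination against manual values; the abstract does not state which thresholds or which R² definition were used. All function and parameter names are hypothetical.

```python
from collections import deque


def label_regions(mask):
    """Return the pixel area of each 4-connected foreground region, in scan order."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                # Breadth-first flood fill over one region.
                seen[r][c] = True
                queue = deque([(r, c)])
                area = 0
                while queue:
                    y, x = queue.popleft()
                    area += 1
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                areas.append(area)
    return areas


def stomatal_traits(mask, min_area_px, image_area_mm2):
    """Density (regions per mm^2) and mean size (px) after a size threshold.

    Assumption: regions below min_area_px are segmentation noise and are dropped.
    """
    areas = [a for a in label_regions(mask) if a >= min_area_px]
    density = len(areas) / image_area_mm2
    mean_size = sum(areas) / len(areas) if areas else 0.0
    return density, mean_size


def r_squared(manual, predicted):
    """Coefficient of determination of predicted values against manual ones."""
    mean_manual = sum(manual) / len(manual)
    ss_res = sum((m - p) ** 2 for m, p in zip(manual, predicted))
    ss_tot = sum((m - mean_manual) ** 2 for m in manual)
    return 1 - ss_res / ss_tot


# Toy mask: two plausible stomata (4 px and 6 px) plus one 1-px speck.
toy_mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
]
density, mean_size = stomatal_traits(toy_mask, min_area_px=2, image_area_mm2=0.5)
print(density, mean_size)  # 4.0 5.0
```

In this toy case the size threshold removes the single-pixel speck, so the count drops from three regions to two. This is one plausible reading of why thresholding would help density estimates in particular, since spurious small regions inflate counts more than they shift mean size.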
About the journal:
Plant Methods is an open access, peer-reviewed, online journal for the plant research community that encompasses all aspects of technological innovation in the plant sciences.
There is no doubt that we have entered an exciting new era in plant biology. The completion of the Arabidopsis genome sequence, and the rapid progress being made in other plant genomics projects are providing unparalleled opportunities for progress in all areas of plant science. Nevertheless, enormous challenges lie ahead if we are to understand the function of every gene in the genome, and how the individual parts work together to make the whole organism. Achieving these goals will require an unprecedented collaborative effort, combining high-throughput, system-wide technologies with more focused approaches that integrate traditional disciplines such as cell biology, biochemistry and molecular genetics.
Technological innovation is probably the most important catalyst for progress in any scientific discipline. Plant Methods’ goal is to stimulate the development and adoption of new and improved techniques and research tools and, where appropriate, to promote consistency of methodologies for better integration of data from different laboratories.