Erik Andvaag, Kaylie Krys, Steven J Shirtliffe, Ian Stavness
Counting Canola: Toward Generalizable Aerial Plant Detection Models
Plant Phenomics, volume 6, article 0268. Published 2024-11-08.
DOI: https://doi.org/10.34133/plantphenomics.0268
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11543947/pdf/
Plant population counts are highly valued by crop producers as important early-season indicators of field health. Traditionally, emergence rate estimates have been acquired through manual counting, an approach that is labor-intensive and relies heavily on sampling techniques. By applying deep learning-based object detection models to aerial field imagery, accurate plant population counts can be obtained for much larger areas of a field. Unfortunately, current detection models often perform poorly when they are faced with image conditions that do not closely resemble the data found in their training sets. In this paper, we explore how specific facets of a plant detector's training set can affect its ability to generalize to unseen image sets. In particular, we examine how a plant detection model's generalizability is influenced by the size, diversity, and quality of its training data. Our experiments show that the gap between in-distribution and out-of-distribution performance cannot be closed by merely increasing the size of a model's training set. We also demonstrate the importance of training set diversity in producing generalizable models, and show how different types of annotation noise can elicit different model behaviors in out-of-distribution test sets. We conduct our investigations with a large and diverse dataset of canola field imagery that we assembled over several years. We also present a new web tool, Canola Counter, which is specifically designed for remote-sensed aerial plant detection tasks. We use the Canola Counter tool to prepare our annotated canola seedling dataset and conduct our experiments. Both our dataset and web tool are publicly available.
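The gap between in-distribution and out-of-distribution performance described above can be illustrated with a minimal sketch. It assumes that performance is summarized as the mean absolute error between predicted and annotated plant counts per image. That metric, the function names `mean_abs_count_error` and `generalization_gap`, and the example counts are all hypothetical choices made for illustration. They are not taken from the paper or from the Canola Counter tool.

```python
from statistics import mean


def mean_abs_count_error(predicted, actual):
    """Mean absolute difference between predicted and annotated counts per image."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one image is required")
    return mean(abs(p - a) for p, a in zip(predicted, actual))


def generalization_gap(in_dist, out_dist):
    """Return (in-distribution error, out-of-distribution error, gap).

    Each argument is a (predicted_counts, annotated_counts) pair of lists.
    A positive gap means the detector counts worse on unseen image sets.
    """
    id_err = mean_abs_count_error(*in_dist)
    ood_err = mean_abs_count_error(*out_dist)
    return id_err, ood_err, ood_err - id_err


# Hypothetical per-image plant counts (illustrative values only).
in_dist = ([48, 52, 62], [50, 50, 60])   # images resembling the training set
out_dist = ([30, 75, 45], [50, 60, 55])  # images from an unseen field or season

if __name__ == "__main__":
    id_err, ood_err, gap = generalization_gap(in_dist, out_dist)
    print(f"in-distribution error: {id_err}")
    print(f"out-of-distribution error: {ood_err}")
    print(f"gap: {gap}")
```

Reporting both errors side by side, rather than a single pooled score, makes it visible whether a change to the training set (more images, more diverse images, cleaner annotations) narrows the gap or only improves in-distribution results.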
About the journal:
Plant Phenomics is an Open Access journal published in affiliation with the State Key Laboratory of Crop Genetics & Germplasm Enhancement, Nanjing Agricultural University (NAU) and published by the American Association for the Advancement of Science (AAAS). Like all partners participating in the Science Partner Journal program, Plant Phenomics is editorially independent from the Science family of journals.
The mission of Plant Phenomics is to publish novel research that advances all aspects of plant phenotyping, from the cell to the plant population level, using innovative combinations of sensor systems and data analytics. Plant Phenomics also aims to connect phenomics to other science domains, such as genomics, genetics, physiology, molecular biology, bioinformatics, statistics, mathematics, and computer science. In doing so, the journal seeks to advance plant sciences and agriculture, forestry, and horticulture by addressing key scientific challenges in plant phenomics.
The scope of the journal covers the latest technologies in plant phenotyping for data acquisition, data management, data interpretation, modeling, and their practical applications for crop cultivation, plant breeding, forestry, horticulture, ecology, and other plant-related domains.