Efficient Generation of Training Libraries for Image Classification Models from Photos of Herbarium Specimens

IF 1.5 | Zone 3 (Biology) | JCR Q3 (Plant Sciences)
A. Schmidt‐Lebuhn, Nunzio J. Knerr
International Journal of Plant Sciences, pp. 387-391. Published 2023-03-08. DOI: 10.1086/724950 (https://doi.org/10.1086/724950)
Citations: 0

Efficient Generation of Training Libraries for Image Classification Models from Photos of Herbarium Specimens
Premise of research. Computer vision has the potential to become a transformative identification tool in biodiversity research and collections management, allowing high-throughput identification and removing the need for nonexpert end users to understand technical terminology. A major bottleneck for taxonomists is the generation of sufficient numbers of training images. Contemporary large-scale imaging projects of herbaria provide an increasing number of specimen photos, but whole-sheet images are not directly suitable for training image classification models targeted at individual taxonomically informative characters.

Methodology. Here, we illustrate a time- and labor-efficient approach for generating training libraries for image classification from photos of herbarium sheets. It involves the annotation of specimen images with bounding boxes using open-source software and automated cropping of annotations with a custom script to produce the training library. We demonstrate the approach on the flower heads of a genus of Asteraceae comprising eight taxa, six species and two nontypus varieties.

Pivotal results. After generating 816 training images from 33 specimen photos with a time investment of only ∼90 min, we trained an image classification model that achieved 98.2% precision and recall.

Conclusions. The demonstrated approach allows taxonomists to use digitized herbarium specimens to produce training libraries for image classification models within hours. We expect that computer vision will increasingly become a part of taxonomic practice.
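The cropping step described under Methodology can be illustrated with a minimal sketch. The abstract does not name the annotation software, the annotation file format, or the authors' script, so everything below is an assumption: the annotations are taken to be in the widely used Pascal VOC XML layout, and the file name, label, coordinates, output folder, and the helper `plan_crops` are hypothetical. The sketch only derives, for each bounding box, a clipped crop rectangle and a per-label output path; the pixel cropping itself would be handed to an imaging library (for example, Pillow's `Image.crop` accepts the same `(left, upper, right, lower)` tuple).

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath

# Hypothetical annotation for one sheet photo, in Pascal VOC XML layout.
# File name, label, and coordinates are invented for illustration only.
EXAMPLE_XML = """
<annotation>
  <filename>sheet_001.jpg</filename>
  <size><width>4000</width><height>6000</height></size>
  <object>
    <name>taxon_a</name>
    <bndbox><xmin>1200</xmin><ymin>800</ymin><xmax>1500</xmax><ymax>1100</ymax></bndbox>
  </object>
  <object>
    <name>taxon_a</name>
    <bndbox><xmin>3900</xmin><ymin>5900</ymin><xmax>4100</xmax><ymax>6050</ymax></bndbox>
  </object>
</annotation>
"""


def plan_crops(xml_text, out_root="training_library"):
    """Return (output_path, box) pairs, one per annotated bounding box.

    Each box is (left, upper, right, lower) in pixels, clipped to the image.
    Output paths are grouped into one folder per label, a common layout
    for image classification training data.
    """
    root = ET.fromstring(xml_text.strip())
    stem = PurePosixPath(root.findtext("filename")).stem
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))

    crops = []
    for index, obj in enumerate(root.iter("object")):
        label = obj.findtext("name")
        box = obj.find("bndbox")
        # Clip the box to the image so a box drawn past the edge still crops.
        left = max(0, int(box.findtext("xmin")))
        upper = max(0, int(box.findtext("ymin")))
        right = min(width, int(box.findtext("xmax")))
        lower = min(height, int(box.findtext("ymax")))
        if right <= left or lower <= upper:
            continue  # skip boxes that are empty after clipping
        target = PurePosixPath(out_root) / label / f"{stem}_{index:03d}.jpg"
        crops.append((str(target), (left, upper, right, lower)))
    return crops


if __name__ == "__main__":
    for target, box in plan_crops(EXAMPLE_XML):
        print(target, box)
```

Writing each crop into a folder named after its label means the resulting directory tree can be read directly by many training pipelines that infer the class from the folder name. Whether the authors' script organizes its output this way is not stated in the abstract.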
Source journal
CiteScore: 4.50
Self-citation rate: 4.30%
Articles published: 65
Review time: 6-12 weeks
About the journal: The International Journal of Plant Sciences has a distinguished history of publishing research in the plant sciences since 1875. IJPS presents high quality, original, peer-reviewed research from laboratories around the world in all areas of the plant sciences. Topics covered range from genetics and genomics, developmental and cell biology, biochemistry and physiology, to morphology and anatomy, systematics, evolution, paleobotany, plant-microbe interactions, and ecology. IJPS does NOT publish papers on agriculture or crop improvement. In addition to full-length research papers, IJPS publishes review articles, including the open access Coulter Reviews, rapid communications, and perspectives. IJPS welcomes contributions that present evaluations and new perspectives on areas of current interest in plant biology. IJPS publishes nine issues per year and regularly features special issues on topics of particular interest, including new and exciting research originally presented at major botanical conferences.