CETR: CenterNet-Vision transformer model for wheat head detection

K. G. Suma, Gurram Sunitha, Ramesh Karnati, E. R. Aruna, Kachi Anvesh, Navnath Kale, P. Krishna Kishore
{"title":"CETR:用于麦头检测的中心网-视觉转换器模型","authors":"K. G. Suma, Gurram Sunitha, Ramesh Karnati, E. R. Aruna, Kachi Anvesh, Navnath Kale, P. Krishna Kishore","doi":"10.32629/jai.v7i3.1189","DOIUrl":null,"url":null,"abstract":"Wheat head detection is a critical task in precision agriculture for estimating crop yield and optimizing agricultural practices. Conventional object detection architectures often struggle with detecting densely packed and overlapping wheat heads in complex agricultural field images. To address this challenge, a novel CEnternet-vision TRansformer model for Wheat Head Detection (CETR) is proposed. CETR model combines the strengths of two cutting-edge technologies—CenterNet and Vision Transformer. A dataset of agricultural farm images labeled with precise wheat head annotations is used to train and evaluate the CETR model. Comprehensive experiments were conducted to compare CETR’s performance against convolutional neural network model commonly used in agricultural applications. The higher mAP value of 0.8318 for CETR compared against AlexNet, VGG19, ResNet152 and MobileNet indicates that the CETR model is more effective in detecting wheat heads in agricultural images. It achieves a higher precision in predicting bounding boxes that align well with the ground truth, resulting in more accurate and reliable wheat head detection. The higher performance of CETR can be attributed to the combination of CenterNet and ViT as a two-stage architecture taking advantage of both methods. Moreover, the transformer-based architecture of CETR enables better generalization across different agricultural environments, making it a suitable solution for automated agricultural applications.","PeriodicalId":307060,"journal":{"name":"Journal of Autonomous Intelligence","volume":"109 8","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"CETR: CenterNet-Vision transformer model for wheat head detection\",\"authors\":\"K. G. Suma, Gurram Sunitha, Ramesh Karnati, E. R. Aruna, Kachi Anvesh, Navnath Kale, P. Krishna Kishore\",\"doi\":\"10.32629/jai.v7i3.1189\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Wheat head detection is a critical task in precision agriculture for estimating crop yield and optimizing agricultural practices. Conventional object detection architectures often struggle with detecting densely packed and overlapping wheat heads in complex agricultural field images. To address this challenge, a novel CEnternet-vision TRansformer model for Wheat Head Detection (CETR) is proposed. CETR model combines the strengths of two cutting-edge technologies—CenterNet and Vision Transformer. A dataset of agricultural farm images labeled with precise wheat head annotations is used to train and evaluate the CETR model. Comprehensive experiments were conducted to compare CETR’s performance against convolutional neural network model commonly used in agricultural applications. The higher mAP value of 0.8318 for CETR compared against AlexNet, VGG19, ResNet152 and MobileNet indicates that the CETR model is more effective in detecting wheat heads in agricultural images. It achieves a higher precision in predicting bounding boxes that align well with the ground truth, resulting in more accurate and reliable wheat head detection. 
The higher performance of CETR can be attributed to the combination of CenterNet and ViT as a two-stage architecture taking advantage of both methods. Moreover, the transformer-based architecture of CETR enables better generalization across different agricultural environments, making it a suitable solution for automated agricultural applications.\",\"PeriodicalId\":307060,\"journal\":{\"name\":\"Journal of Autonomous Intelligence\",\"volume\":\"109 8\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-01-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Autonomous Intelligence\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.32629/jai.v7i3.1189\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Autonomous Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.32629/jai.v7i3.1189","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

Wheat head detection is a critical task in precision agriculture for estimating crop yield and optimizing agricultural practices. Conventional object detection architectures often struggle to detect densely packed and overlapping wheat heads in complex agricultural field images. To address this challenge, a novel CEnternet-vision TRansformer model for Wheat Head Detection (CETR) is proposed. The CETR model combines the strengths of two cutting-edge technologies, CenterNet and Vision Transformer. A dataset of agricultural farm images labeled with precise wheat head annotations is used to train and evaluate the CETR model. Comprehensive experiments were conducted to compare CETR's performance against convolutional neural network models commonly used in agricultural applications. CETR's higher mAP of 0.8318 compared with AlexNet, VGG19, ResNet152, and MobileNet indicates that the model is more effective at detecting wheat heads in agricultural images. It achieves higher precision in predicting bounding boxes that align well with the ground truth, resulting in more accurate and reliable wheat head detection. The superior performance of CETR can be attributed to the combination of CenterNet and ViT in a two-stage architecture that takes advantage of both methods. Moreover, the transformer-based architecture of CETR enables better generalization across different agricultural environments, making it a suitable solution for automated agricultural applications.
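
The abstract describes CETR as a two-stage design that pairs CenterNet-style center and size prediction with a Vision Transformer stage, but it does not spell out how the two stages are connected. The sketch below is a minimal, hypothetical illustration of that idea in PyTorch: a convolutional stage produces a center heatmap and box sizes, and a small transformer encoder re-scores the same feature map using global context. All module names, dimensions, and the score-fusion step are assumptions made for illustration, not the paper's actual implementation.

```python
# Hypothetical sketch of a two-stage CenterNet + ViT detector, illustrating
# the general idea in the abstract. Every module name, dimension, and the way
# the stages are fused is an assumption, not the published CETR architecture.
import torch
import torch.nn as nn


class CenterHeatmapStage(nn.Module):
    """Stage 1 (CenterNet-style): predict a per-location center heatmap plus
    box width/height so that heatmap peaks become wheat-head proposals."""

    def __init__(self, in_ch: int = 3, feat_ch: int = 64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.heatmap_head = nn.Conv2d(feat_ch, 1, 1)  # center confidence
        self.size_head = nn.Conv2d(feat_ch, 2, 1)     # box width and height

    def forward(self, images: torch.Tensor):
        feats = self.backbone(images)
        return torch.sigmoid(self.heatmap_head(feats)), self.size_head(feats), feats


class ViTRefineStage(nn.Module):
    """Stage 2 (ViT-style): treat feature-map cells as tokens and let a
    transformer encoder re-score them with image-wide context, which is what
    helps with densely packed, overlapping heads."""

    def __init__(self, feat_ch: int = 64, depth: int = 2, heads: int = 4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=feat_ch, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.score_head = nn.Linear(feat_ch, 1)

    def forward(self, feats: torch.Tensor):
        b, c, h, w = feats.shape
        tokens = feats.flatten(2).transpose(1, 2)      # (B, H*W, C)
        refined = self.encoder(tokens)
        return torch.sigmoid(self.score_head(refined)).view(b, 1, h, w)


if __name__ == "__main__":
    images = torch.randn(2, 3, 128, 128)               # dummy field images
    stage1 = CenterHeatmapStage()
    stage2 = ViTRefineStage()
    heatmap, sizes, feats = stage1(images)
    refined_scores = stage2(feats)
    # Fuse CenterNet peak confidence with the transformer's context-aware
    # score (one plausible fusion, not necessarily the paper's).
    final_scores = heatmap * refined_scores
    print(final_scores.shape)                          # torch.Size([2, 1, 32, 32])
```

One reason such a pairing can help with densely packed heads is that self-attention lets each location weigh evidence from the whole image, while the CenterNet-style heatmap keeps localization anchor-free and inexpensive.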