A Hybrid Wheat Head Detection model with Incorporated CNN and Transformer

Shou Harada, Xian-Hua Han
{"title":"A Hybrid Wheat Head Detection model with Incorporated CNN and Transformer","authors":"Shou Harada, Xian-Hua Han","doi":"10.23919/MVA57639.2023.10216087","DOIUrl":null,"url":null,"abstract":"Wheat head detection is an important research topic for production estimation and growth management. Motivated by the great advantages of the deep convolution neural networks (DCNNs) in many vision tasks, the deep-learning based methods have dominated the wheat head detection field, and manifest remarkable performance improvement compared with the traditional image processing methods. The existing methods usually divert the proposed detection models for the generic object detection to wheat head detection, and are insufficient in taking account of the specific characteristics of the wheat head images such as large variations due to different growth stages, high density and overlaps. This work exploits a novel hybrid wheat detection model by incorporating the CNN and transformer for modeling long-range dependence. Specifically, we firstly employ a backbone ResNet to extract multi-scale features, and leverage an inter-scale feature fusion module to aggregate coarse-to-fine features together for capturing sufficient spatial detail to localize small-size wheat head. Moreover, we propose a novel and efficient transformer block by incorporating the self-attention module in channel direction and the feature feed-forward subnet to explore the interaction among the aggregated multi-scale features. Finally a prediction head produces the centerness and size of wheat heads to obtain a simple anchor-free detection model. Extensive experiments on the Global Wheat Head Detection (GWHD) dataset have demonstrated the superiority of our proposed model over the existing state-of-the-art methods as well as the baseline model.","PeriodicalId":338734,"journal":{"name":"2023 18th International Conference on Machine Vision and Applications (MVA)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 18th International Conference on Machine Vision and Applications (MVA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.23919/MVA57639.2023.10216087","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Wheat head detection is an important research topic for production estimation and growth management. Motivated by the great advantages of the deep convolution neural networks (DCNNs) in many vision tasks, the deep-learning based methods have dominated the wheat head detection field, and manifest remarkable performance improvement compared with the traditional image processing methods. The existing methods usually divert the proposed detection models for the generic object detection to wheat head detection, and are insufficient in taking account of the specific characteristics of the wheat head images such as large variations due to different growth stages, high density and overlaps. This work exploits a novel hybrid wheat detection model by incorporating the CNN and transformer for modeling long-range dependence. Specifically, we firstly employ a backbone ResNet to extract multi-scale features, and leverage an inter-scale feature fusion module to aggregate coarse-to-fine features together for capturing sufficient spatial detail to localize small-size wheat head. Moreover, we propose a novel and efficient transformer block by incorporating the self-attention module in channel direction and the feature feed-forward subnet to explore the interaction among the aggregated multi-scale features. Finally a prediction head produces the centerness and size of wheat heads to obtain a simple anchor-free detection model. Extensive experiments on the Global Wheat Head Detection (GWHD) dataset have demonstrated the superiority of our proposed model over the existing state-of-the-art methods as well as the baseline model.
结合CNN和变压器的杂交小麦抽穗检测模型
小麦抽穗检测是小麦产量估算和生长管理的重要研究课题。由于深度卷积神经网络(deep convolution neural networks, DCNNs)在许多视觉任务中的巨大优势,基于深度学习的方法在麦穗检测领域占据主导地位,与传统的图像处理方法相比,具有显著的性能提升。现有方法通常将所提出的一般目标检测模型转移到麦穗检测上,不足以考虑到麦穗图像不同生长阶段变化大、密度大、重叠等具体特征。本文利用CNN和变压器对小麦的远程依赖关系进行建模,建立了一种新的杂交小麦检测模型。具体而言,我们首先利用骨干ResNet提取多尺度特征,并利用尺度间特征融合模块将粗到细的特征聚合在一起,以捕获足够的空间细节来定位小尺寸麦穗。此外,我们提出了一种新颖高效的变压器模块,将通道方向上的自关注模块与特征前馈子网相结合,探索聚合的多尺度特征之间的相互作用。最后由预测头生成麦穗的中心度和大小,得到一个简单的无锚检测模型。在全球小麦穗检测(GWHD)数据集上进行的大量实验表明,我们提出的模型优于现有的最先进的方法以及基线模型。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信