MSNet: A multispectral-image driven rapeseed canopy instance segmentation network

IF 8.2 Q1 AGRICULTURE, MULTIDISCIPLINARY
Yuang Yang, Xiaole Wang, Fugui Zhang, Zhenchao Wu, Yu Wang, Yujie Liu, Xuan Lv, Bowen Luo, Liqing Chen, Yang Yang
Journal: Artificial Intelligence in Agriculture, vol. 15, no. 4, pp. 642-658
Published: 2025-05-31
DOI: 10.1016/j.aiia.2025.05.008
URL: https://www.sciencedirect.com/science/article/pii/S2589721725000637

Abstract

Accurate detection of rapeseed plants and measurement of canopy area growth provide crucial phenotypic indicators of growth status, and accurate identification of the rapeseed target and its growth region supplies important data for phenotypic analysis and breeding research. In natural field environments, however, rapeseed detection remains a substantial challenge because RGB-only imagery has limited feature representation capability. To address this, the study proposes MSNet, a dual-modal instance segmentation network based on YOLOv11n-seg that integrates RGB and near-infrared (NIR) modalities. Its main contributions are three fusion location strategies (frontend, mid-stage, and backend fusion) and a newly introduced Hierarchical Attention Fusion Block (HAFB) for multimodal feature fusion. Comparative experiments on fusion location indicate that mid-stage fusion achieves the best balance between detection accuracy and parameter efficiency, improving mAP50:95 by up to 3.5% over the baseline network. With the HAFB module added, the MSNet-H-HAFB model improves mAP50:95 by 6.5% relative to the baseline while increasing the parameter count by less than 38%. Mid-stage fusion delivered the best detection performance in all experiments, which gives clear design guidance for choosing fusion locations in future multimodal networks. Comparisons with various RGB-only instance segmentation models further show that all proposed MSNet-HAFB fusion models significantly outperform single-modal models on rapeseed counting, confirming the potential of multispectral fusion for agricultural target recognition. Finally, MSNet was applied in an agricultural case study covering vegetation-index-level analysis and frost damage classification. The results predict ZN6–2836 and ZS11 as potentially superior varieties, and the EVI2 vegetation index achieved the best performance in rapeseed frost damage classification.
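The abstract names EVI2 but does not say how it was computed. As a minimal sketch, assuming the standard two-band Enhanced Vegetation Index, EVI2 = 2.5 (NIR - Red) / (NIR + 2.4 Red + 1), applied to surface reflectance in the range 0 to 1, the index could be evaluated per pixel and averaged over one segmented canopy instance. The function names `evi2` and `mean_evi2` and the flat-list mask layout are illustrative assumptions, not the authors' pipeline.

```python
from typing import Sequence


def evi2(nir: float, red: float) -> float:
    """Two-band Enhanced Vegetation Index from NIR and red reflectance.

    Assumes the standard formulation:
        EVI2 = 2.5 * (NIR - Red) / (NIR + 2.4 * Red + 1)
    """
    denom = nir + 2.4 * red + 1.0
    if denom == 0.0:
        raise ValueError("denominator is zero for these reflectance values")
    return 2.5 * (nir - red) / denom


def mean_evi2(
    nir_pixels: Sequence[float],
    red_pixels: Sequence[float],
    mask: Sequence[bool],
) -> float:
    """Average EVI2 over the pixels selected by one instance mask.

    All three sequences are flattened images of equal length (an
    assumption made here for simplicity); mask[i] is True where pixel i
    belongs to the canopy instance.
    """
    if not (len(nir_pixels) == len(red_pixels) == len(mask)):
        raise ValueError("inputs must have the same length")
    values = [
        evi2(n, r)
        for n, r, inside in zip(nir_pixels, red_pixels, mask)
        if inside
    ]
    if not values:
        raise ValueError("mask selects no pixels")
    return sum(values) / len(values)
```

A per-instance mean of this kind is one plausible way to turn a segmentation mask into a single vegetation index value per plant; whether the study aggregated by mean, median, or another statistic is not stated in the abstract.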
Source journal: Artificial Intelligence in Agriculture
Subject area: Engineering, Engineering (miscellaneous)
CiteScore: 21.60
Self-citation rate: 0.00%
Articles published: 18
Review time: 12 weeks