MSNet: A multispectral-image driven rapeseed canopy instance segmentation network

Yuang Yang, Xiaole Wang, Fugui Zhang, Zhenchao Wu, Yu Wang, Yujie Liu, Xuan Lv, Bowen Luo, Liqing Chen, Yang Yang

Artificial Intelligence in Agriculture, 15(4), pp. 642-658, 2025. DOI: 10.1016/j.aiia.2025.05.008
Rapeseed plant counts and canopy area growth are crucial phenotypic indicators of growth status, so accurate identification of rapeseed targets and their growth regions provides significant data support for phenotypic analysis and breeding research. However, in natural field environments, rapeseed detection remains a substantial challenge because RGB-only modalities offer limited feature representation. To address this challenge, this study proposes MSNet, a dual-modal instance segmentation network based on YOLOv11n-seg that integrates RGB and near-infrared (NIR) modalities. The main improvements comprise three fusion-location strategies (frontend fusion, mid-stage fusion, and backend fusion) and a newly introduced Hierarchical Attention Fusion Block (HAFB) for multimodal feature fusion. Comparative experiments on fusion locations indicate that the mid-stage strategy achieves the best balance between detection accuracy and parameter efficiency, improving mAP50:95 by up to 3.5 % over the baseline network. After introducing the HAFB module, the MSNet-H-HAFB model improves mAP50:95 by 6.5 % relative to the baseline, with less than a 38 % increase in parameter count. Notably, mid-stage fusion consistently delivered the best detection performance across all experiments, providing clear design guidance for selecting fusion locations in future multimodal networks. In addition, comparisons with various RGB-only instance segmentation models show that all proposed MSNet-HAFB fusion models significantly outperform single-modal models in rapeseed count detection tasks, confirming the potential advantages of multispectral fusion strategies in agricultural target recognition. Finally, MSNet was applied in an agricultural case study covering vegetation index analysis and frost damage classification: ZN6–2836 and ZS11 were predicted to be potential superior varieties, and the EVI2 vegetation index achieved the best performance in rapeseed frost damage classification.
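The abstract names the fusion components but not their implementation. As a rough, hypothetical illustration of what mid-stage dual-branch RGB+NIR fusion with an attention-based fusion block could look like in PyTorch (the block below is a generic channel-attention fusion, not the paper's actual HAFB; all layer sizes and branch depths are illustrative assumptions):

```python
# Hypothetical sketch only: a mid-stage RGB+NIR fusion backbone with a simple
# channel-attention fusion block. This is NOT the paper's HAFB or MSNet code.
import torch
import torch.nn as nn

class AttentionFusionBlock(nn.Module):
    """Fuses RGB and NIR feature maps via channel attention (illustrative)."""
    def __init__(self, channels: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels // 4, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, 2 * channels, 1),
            nn.Sigmoid(),
        )
        self.project = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, rgb: torch.Tensor, nir: torch.Tensor) -> torch.Tensor:
        x = torch.cat([rgb, nir], dim=1)   # (B, 2C, H, W)
        w = self.gate(self.pool(x))        # per-channel weights in (0, 1)
        return self.project(x * w)         # fused features, (B, C, H, W)

def conv_stage(cin: int, cout: int) -> nn.Sequential:
    """One downsampling stage: stride-2 conv + BN + SiLU."""
    return nn.Sequential(nn.Conv2d(cin, cout, 3, stride=2, padding=1),
                         nn.BatchNorm2d(cout), nn.SiLU(inplace=True))

class MidFusionBackbone(nn.Module):
    """Two parallel stems; features fuse mid-way, then shared layers continue."""
    def __init__(self):
        super().__init__()
        self.rgb_stem = nn.Sequential(conv_stage(3, 32), conv_stage(32, 64))
        self.nir_stem = nn.Sequential(conv_stage(1, 32), conv_stage(32, 64))
        self.fuse = AttentionFusionBlock(64)
        self.shared = nn.Sequential(conv_stage(64, 128), conv_stage(128, 256))

    def forward(self, rgb: torch.Tensor, nir: torch.Tensor) -> torch.Tensor:
        return self.shared(self.fuse(self.rgb_stem(rgb), self.nir_stem(nir)))

if __name__ == "__main__":
    rgb = torch.randn(2, 3, 640, 640)
    nir = torch.randn(2, 1, 640, 640)
    print(MidFusionBackbone()(rgb, nir).shape)  # torch.Size([2, 256, 40, 40])
```

One intuition consistent with the reported accuracy/parameter trade-off is visible in this structure: each modality keeps only a shallow modality-specific stem, while the deeper, parameter-heavy layers are shared after fusion, unlike backend fusion, which duplicates most of the backbone per modality.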
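EVI2, reported in the abstract as the best-performing index for frost damage classification, is the standard two-band Enhanced Vegetation Index (Jiang et al., 2008): EVI2 = 2.5 (NIR − Red) / (NIR + 2.4 Red + 1). A minimal per-pixel computation, assuming surface reflectance scaled to [0, 1] and a hypothetical canopy mask such as one produced by an instance segmentation model:

```python
import numpy as np

def evi2(nir: np.ndarray, red: np.ndarray) -> np.ndarray:
    """Two-band Enhanced Vegetation Index (Jiang et al., 2008).

    Expects surface reflectance in [0, 1]; the +1 term keeps the
    denominator positive over that range.
    """
    return 2.5 * (nir - red) / (nir + 2.4 * red + 1.0)

# Example: mean EVI2 over a segmented canopy region.
# Random reflectance and an all-ones mask stand in for real band data
# and a real instance mask here.
nir = np.random.rand(512, 512).astype(np.float32)
red = np.random.rand(512, 512).astype(np.float32)
mask = np.ones((512, 512), dtype=bool)  # placeholder canopy mask
canopy_evi2 = evi2(nir, red)[mask].mean()
print(f"mean canopy EVI2: {canopy_evi2:.3f}")
```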