A lightweight method for apple disease segmentation using multimodal transformer and sensor fusion

IF 8.9 | Zone 1, Agricultural and Forestry Sciences | JCR Q1, AGRICULTURE, MULTIDISCIPLINARY
Yihong Song, Manzhou Li, Zizhe Zhou, Jiahe Zhang, Xiangge Du, Min Dong, Qinhong Jiang, Che Li, Yuantao Hu, Qiulin Yu, Dongmei Wang, Hegan Dong, Shuo Yan
Journal: Computers and Electronics in Agriculture, volume 237, Article 110737
Publication date: 2025-07-19
Publication type: Journal Article
DOI: 10.1016/j.compag.2025.110737
Source: https://www.sciencedirect.com/science/article/pii/S0168169925008439
Citations: 0

Abstract

A lightweight method for apple disease segmentation using multimodal transformer and sensor fusion
To address the challenges of multimodal data fusion, low deployment efficiency, and inadequate recognition robustness in complex environments for fruit tree disease segmentation and severity classification, a multimodal parallel transformer-based framework was proposed for apple disease recognition and grading. This method integrates image data with multi-dimensional environmental sensor information. An image segmentation preprocessing module was incorporated to enhance lesion region representation, while a cross-scale attention mechanism and a frame-wise diffusion module were introduced to improve robustness under challenging backgrounds. Additionally, pruning, quantization, and knowledge distillation techniques were employed to enable lightweight deployment. Experimental results demonstrated that the full model achieved outstanding performance on apple disease recognition tasks, reaching a precision of 0.98, recall of 0.93, F1-score of 0.95, and accuracy of 0.96, surpassing several state-of-the-art methods including Mask R-CNN, SegFormer, and Swin Transformer. After compression, the model size was reduced to 76.4 MB, and computational complexity decreased to 6.1 G, enabling real-time inference speeds of 25.2 FPS and 39.6 FPS on Jetson Xavier and Orin platforms, respectively. Ablation studies confirmed the performance contributions of the segmentation preprocessing, sensor fusion, and diffusion modules, demonstrating the potential of the proposed framework for deployment in resource-constrained agricultural scenarios.
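The reported metrics can be sanity-checked with a few lines of arithmetic. The sketch below recomputes the F1-score from the stated precision and recall, and converts the stated throughput figures into approximate per-frame latency. The function names are illustrative, the inputs are the rounded values from the abstract, and the latency conversion assumes frames are processed one at a time, which the abstract does not state.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def latency_ms(fps: float) -> float:
    """Average per-frame latency in milliseconds implied by a throughput in FPS.

    Assumption: frames are processed sequentially (batch size 1).
    """
    return 1000.0 / fps


# Rounded values taken from the abstract.
print(f"F1 from P=0.98, R=0.93: {f1_score(0.98, 0.93):.4f}")
for platform, fps in {"Jetson Xavier": 25.2, "Jetson Orin": 39.6}.items():
    print(f"{platform}: {latency_ms(fps):.1f} ms per frame")
```

The recomputed F1 rounds to the reported 0.95, so the three figures are mutually consistent at two decimal places.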
Source journal
Computers and Electronics in Agriculture (Engineering and Technology; Computer Science, Interdisciplinary Applications)
CiteScore: 15.30
Self-citation rate: 14.50%
Articles published: 800
Review time: 62 days
About the journal: Computers and Electronics in Agriculture provides international coverage of advancements in computer hardware, software, electronic instrumentation, and control systems applied to agricultural challenges. Encompassing agronomy, horticulture, forestry, aquaculture, and animal farming, the journal publishes original papers, reviews, and application notes. It explores the use of computers and electronics in plant or animal agricultural production, covering topics like agricultural soils, water, pests, controlled environments, and waste. The scope extends to on-farm post-harvest operations and relevant technologies, including artificial intelligence, sensors, machine vision, robotics, networking, and simulation modeling. Its companion journal, Smart Agricultural Technology, continues the focus on smart applications in production agriculture.