SSOD-MViT: A novel model for recognizing alfalfa seed pod maturity based on semi-supervised learning

IF 7.7 · CAS Zone 1 (Agricultural and Forestry Sciences) · JCR Q1 AGRICULTURE, MULTIDISCIPLINARY

Computers and Electronics in Agriculture · Pub Date: 2025-04-23 · DOI: 10.1016/j.compag.2025.110439

Fuyang Tian, Yinuo Zhang, Shakeel Ahmed Soomro, Qiang Wang, Shuaiyang Zhang, Ji Zhang, Qinglu Yang, Yunpeng Yan, Zhenwei Yu, Zhanhua Song

Volume 236, Article 110439. Full text: https://www.sciencedirect.com/science/article/pii/S0168169925005459

Citations: 0

Abstract

This study addresses two challenges: recognizing alfalfa seed pod maturity in complex field environments, and the strong influence of the quantity of labeled samples on the performance of object detection algorithms. A method for identifying the maturity of alfalfa seed pod clusters was proposed using an unmanned aerial vehicle (UAV) and a semi-supervised deep learning model, SSOD-MViT (Semi-Supervised Object Detection based on the MViTNet). First, to enhance the model's ability to extract key feature information, an improved version of the lightweight general-purpose vision transformer MobileViT (Mobile Vision Transformer) was employed as the backbone. ScConv (Spatial and Channel Reconstruction Convolution) was deeply integrated to reduce redundant information within the channels, thereby decreasing the model's computational load. Second, a small object detection layer was incorporated into the neck, and the Efficient Multi-Scale Attention (EMA) module was added to the C2f structure. The SAHI (Slicing Aided Hyper Inference) algorithm was integrated into the inference process, which improved the detection accuracy for small alfalfa seed pod clusters and enhanced the model's resistance to interference. Finally, consistency regularization was incorporated into the model to reduce its dependence on sample data. Experimental results showed that SSOD-MViT achieved an mAP@0.5 of 92.23%, an improvement of 12.31% over the YOLOv8 object detection model. Compared with the Faster R-CNN object detection model, the average detection time was reduced by 175.81 ms. The proposed MViTNet (MobileViT Network) model had a storage size of 5.3 MB and an average detection time of 82.34 ms, providing favorable conditions for subsequent deployment on embedded devices. This research improved on the performance of existing models in detecting alfalfa seed pod maturity in complex field environments. It also aids in determining the optimal harvesting period for alfalfa seeds, thereby providing technical support for enhancing productivity and reducing production costs in the alfalfa seed production industry.
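The slicing idea behind the inference step mentioned in the abstract can be illustrated with a minimal sketch. This is neither the authors' code nor the API of the SAHI library. The function names, the 640-pixel tile size, and the 128-pixel overlap are assumptions chosen only for illustration. The sketch splits a large image into overlapping windows, so that a detector can be run on each window, and maps a window-local box back to full-image coordinates.

```python
from typing import List, Tuple

# (x_min, y_min, x_max, y_max) in pixels
Box = Tuple[int, int, int, int]


def slice_starts(length: int, tile: int, overlap_px: int) -> List[int]:
    """Start offsets of tiles that together cover the range [0, length)."""
    if not 0 <= overlap_px < tile:
        raise ValueError("overlap_px must satisfy 0 <= overlap_px < tile")
    if length <= tile:
        return [0]  # a single tile already covers the whole axis
    stride = tile - overlap_px
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile sits flush with the border
    return starts


def slice_windows(width: int, height: int,
                  tile: int = 640, overlap_px: int = 128) -> List[Box]:
    """Overlapping windows covering a width x height image, row by row."""
    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in slice_starts(height, tile, overlap_px)
            for x in slice_starts(width, tile, overlap_px)]


def to_full_image(box: Box, window: Box) -> Box:
    """Shift a box predicted inside a window back to full-image coordinates."""
    ox, oy = window[0], window[1]
    return (box[0] + ox, box[1] + oy, box[2] + ox, box[3] + oy)


if __name__ == "__main__":
    # Hypothetical 1000 x 500 image; a real pipeline would run a detector
    # on each window and merge overlapping predictions afterwards.
    for window in slice_windows(1000, 500):
        print(window)
```

In a complete pipeline, the boxes collected from all windows would still need to be merged, for example with non-maximum suppression, because objects in the overlap regions are detected more than once. That step is omitted here.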

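Source journal

Computers and Electronics in Agriculture (Engineering and Technology; Computer Science: Interdisciplinary Applications)

CiteScore: 15.30

Self-citation rate: 14.50%

Articles published: 800

Review time: 62 days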

Journal description: Computers and Electronics in Agriculture provides international coverage of advancements in computer hardware, software, electronic instrumentation, and control systems applied to agricultural challenges. Encompassing agronomy, horticulture, forestry, aquaculture, and animal farming, the journal publishes original papers, reviews, and application notes. It explores the use of computers and electronics in plant or animal agricultural production, covering topics such as agricultural soils, water, pests, controlled environments, and waste. The scope extends to on-farm post-harvest operations and relevant technologies, including artificial intelligence, sensors, machine vision, robotics, networking, and simulation modeling. Its companion journal, Smart Agricultural Technology, continues the focus on smart applications in production agriculture.