Fuyang Tian, Yinuo Zhang, Shakeel Ahmed Soomro, Qiang Wang, Shuaiyang Zhang, Ji Zhang, Qinglu Yang, Yunpeng Yan, Zhenwei Yu, Zhanhua Song
Title: SSOD-MViT: A novel model for recognizing alfalfa seed pod maturity based on semi-supervised learning
DOI: 10.1016/j.compag.2025.110439
Journal: Computers and Electronics in Agriculture, Volume 236, Article 110439
Published: 2025-04-23
Impact factor: 7.7 (JCR Q1, Agriculture, Multidisciplinary)
URL: https://www.sciencedirect.com/science/article/pii/S0168169925005459
Citations: 0
Abstract
This study was conducted to address two challenges: recognizing alfalfa seed pod maturity in complex field environments, and the significant impact of the quantity of labeled samples on the performance of object detection algorithms. A method for identifying the maturity of alfalfa seed pod clusters was proposed using an unmanned aerial vehicle (UAV) and a semi-supervised deep learning model, SSOD-MViT (Semi-Supervised Object Detection based on MViTNet). First, to enhance the model’s capability to extract key feature information, an improved version of the lightweight general-purpose vision transformer MobileViT (Mobile Vision Transformer) was employed as the backbone. ScConv (Spatial and Channel Reconstruction Convolution) was also deeply integrated to reduce redundant information within the channels, thereby decreasing the computational load of the model. Second, a small-object detection layer was incorporated into the Neck, and the Efficient Multi-Scale Attention (EMA) module was added to the C2f structure. The SAHI (Slicing Aided Hyper Inference) algorithm was integrated into the inference process, which improved the detection accuracy for small alfalfa seed pod clusters and enhanced the model’s resistance to interference. Finally, Consistency Regularization was incorporated into the model to reduce its dependency on sample data. The experimental results showed that SSOD-MViT achieved a mAP@0.5 of 92.23%, an improvement of 12.31% over the YOLOv8 object detection model. Compared with the Faster R-CNN object detection model, the average detection time was reduced by 175.81 ms. The proposed MViTNet (MobileViT Network) model had a storage size of 5.3 MB and an average detection time of 82.34 ms, providing favorable conditions for subsequent deployment on embedded devices. This research improved on the detection performance of existing models for alfalfa seed pod maturity in complex field environments. It also aids in determining the optimal harvesting period for alfalfa seeds, thereby providing technical support to enhance productivity and reduce production costs in the alfalfa seed production industry.
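The slicing idea behind the SAHI step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the SAHI library's API: the function names, the 640-pixel tile size, and the 0.2 overlap ratio are all assumptions chosen for illustration. The sketch only shows how a large UAV frame might be cut into overlapping tiles and how a box detected inside one tile is shifted back into full-image coordinates.

```python
from typing import List, Tuple

# A box is (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[int, int, int, int]


def _starts(length: int, tile: int, step: int) -> List[int]:
    """Start offsets along one axis so that the tiles cover [0, length)."""
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile, step))
    starts.append(length - tile)  # last tile sits flush with the far edge
    return starts


def slice_windows(width: int, height: int,
                  tile: int = 640, overlap: float = 0.2) -> List[Box]:
    """Return overlapping tile windows covering a width x height image.

    `tile` and `overlap` are illustrative defaults, not values from the paper.
    """
    if tile <= 0:
        raise ValueError("tile must be positive")
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    step = max(1, int(tile * (1.0 - overlap)))
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in _starts(height, tile, step)
        for x in _starts(width, tile, step)
    ]


def to_full_image(box: Box, window: Box) -> Box:
    """Shift a box predicted in tile coordinates into full-image coordinates."""
    offset_x, offset_y = window[0], window[1]
    return (box[0] + offset_x, box[1] + offset_y,
            box[2] + offset_x, box[3] + offset_y)


if __name__ == "__main__":
    windows = slice_windows(1000, 800)
    for window in windows:
        print(window)
    # A hypothetical detection at (10, 20, 30, 40) inside the last tile:
    print(to_full_image((10, 20, 30, 40), windows[-1]))
```

In a full pipeline, a detector would run on each tile, the shifted boxes from all tiles would be pooled, and duplicates from the overlapping regions would typically be merged, for example with non-maximum suppression. That merging step is omitted here.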
About the journal:
Computers and Electronics in Agriculture provides international coverage of advancements in computer hardware, software, electronic instrumentation, and control systems applied to agricultural challenges. Encompassing agronomy, horticulture, forestry, aquaculture, and animal farming, the journal publishes original papers, reviews, and applications notes. It explores the use of computers and electronics in plant or animal agricultural production, covering topics like agricultural soils, water, pests, controlled environments, and waste. The scope extends to on-farm post-harvest operations and relevant technologies, including artificial intelligence, sensors, machine vision, robotics, networking, and simulation modeling. Its companion journal, Smart Agricultural Technology, continues the focus on smart applications in production agriculture.