Jiangtao Qi, Xv Cong, Weirong Zhang, Fangfang Gao, Bo Zhao, Hui Guo
"Rapid Detection of Ripe Tomatoes in Unstructured Environments"
Journal of Field Robotics, Vol. 42, No. 6, pp. 2920-2935. Published 2025-04-15.
DOI: 10.1002/rob.22556 (https://onlinelibrary.wiley.com/doi/10.1002/rob.22556)
Citations: 0
Abstract
To achieve efficient detection of ripe tomatoes in unstructured environments, this paper proposed an improved YOLOv7 rapid detection network for ripe tomatoes. First, the CSP-Darknet53 structure of the original YOLOv7 backbone was replaced with the FasterNet structure to improve detection efficiency and reduce the number of model parameters. Second, a Global Attention Mechanism (GAM) was introduced to strengthen tomato feature representation at the cost of only a small increase in parameters. Next, a Diverse Branch Block (DBB) module was integrated into the ELAN module of the head structure to improve inference efficiency. Finally, the batch normalization scale factor γ was selected as the sparsity indicator: an L1 regularization term on γ was used to train the original model toward sparsity, and the slim pruning algorithm was then applied for global channel pruning to compress the model. The pruned model was retrained through fine-tuning to restore detection accuracy to near its pre-pruning level. Experimental results show that the improved model achieves a mean average precision of 96.49%, essentially unchanged from the original model, while the parameter count, computational cost, and model size are reduced by 52.16%, 56.84%, and 36.95%, respectively, yielding a 32.09% increase in recognition frame rate. Compared with similar object detection models (SSD, YOLOv3, YOLOv4, YOLOv5s, YOLOX, and YOLOv8), the Improved-YOLOv7 model reduces the parameter count by 4.44%-89.05%, computational complexity by 30.37%-91.18%, and model size by 26.43%-72.16%. This work provides technical support for recognizing ripe tomatoes in unstructured environments.
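The sparsity-training and pruning pipeline described in the abstract follows the general pattern of BatchNorm-scale-based channel pruning: an L1 penalty pushes BN γ values toward zero during training, then a single global threshold selects which channels survive. The sketch below is not the authors' code; it is a minimal pure-Python illustration of that pattern, with the layer names, learning rate, sparsity factor, and prune ratio all chosen as assumptions for the example.

```python
def l1_sparsity_step(gammas, lr=0.01, s=1e-4):
    """One L1-subgradient update on BN gamma values.

    Sparsity training adds s * sum(|gamma|) to the loss; its
    subgradient contributes s * sign(gamma) to each gamma's gradient,
    shrinking unimportant scale factors toward zero.
    """
    sign = lambda g: (g > 0) - (g < 0)
    return [g - lr * s * sign(g) for g in gammas]


def global_channel_masks(layer_gammas, prune_ratio):
    """Global channel pruning via a single |gamma| threshold.

    layer_gammas: dict mapping layer name -> list of BN gamma values.
    prune_ratio:  fraction of channels (across ALL layers) to remove.
    Returns a dict of boolean keep-masks, one per layer.
    """
    # Pool every channel's |gamma| across the whole network, then pick
    # the threshold so that prune_ratio of channels fall below it.
    all_g = sorted(abs(g) for gs in layer_gammas.values() for g in gs)
    cut = int(prune_ratio * len(all_g))
    threshold = all_g[cut] if cut < len(all_g) else float("inf")
    return {name: [abs(g) >= threshold for g in gs]
            for name, gs in layer_gammas.items()}


# Example (hypothetical layer names): near-zero gammas are pruned
# regardless of which layer they sit in.
gammas = {"elan1": [0.9, 0.02, 0.7], "elan2": [0.01, 0.8, 0.03]}
masks = global_channel_masks(gammas, prune_ratio=0.5)
```

After pruning, the surviving channels are copied into a smaller network and fine-tuned, which is how the paper recovers accuracy to near the pre-pruning level.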
Journal Description:
The Journal of Field Robotics seeks to promote scholarly publications dealing with the fundamentals of robotics in unstructured and dynamic environments.
The Journal focuses on experimental robotics and encourages publication of work that has both theoretical and practical significance.