Authors: Shouxiang Jin, Lei Zhou, Hongping Zhou
DOI: 10.1016/j.compag.2025.110394
Journal: Computers and Electronics in Agriculture, Volume 235, Article 110394
Published: 2025-04-12 (Journal Article)
Journal metrics: Impact Factor 8.9; JCR Q1, Agriculture, Multidisciplinary
URL: https://www.sciencedirect.com/science/article/pii/S0168169925005009
CO-YOLO: A lightweight and efficient model for Camellia oleifera fruit object detection and posture determination
The complex growth patterns of Camellia oleifera fruits in natural environments pose significant challenges for harvesting robots. Conventional solutions often rely on complex 3D point cloud processing to detect these growth patterns and enable robotic harvesting. In this study, a method for recognizing the growth patterns of Camellia oleifera fruits in 2D images is proposed. To address the challenges of precise posture detection, the fruits are categorized into five types based on their growth patterns: Front, Up, Down, Left, and Right. Occluded fruits are classified separately, and a dedicated dataset for posture recognition is created. Furthermore, a posture detection model, CO-YOLO, based on the YOLO11n architecture, is introduced. The Multi-scale Aggregation Attention (MMA) module replaces the original C3f2 module, enabling the fusion of feature information across multiple scales, which enhances the model's perceptual capabilities and improves posture recognition accuracy. Additionally, the Depth Pointwise Convolutional (DPW) module is introduced to replace standard convolutions in the backbone and neck networks, enabling better fusion of channel features, enhancing the representation of posture features in the target region, and reducing the number of parameters. Experimental results show that CO-YOLO achieves a precision of 90.6 %, a recall of 87.0 %, and an mAP@0.5 of 93.7 %. Compared to YOLO11s, CO-YOLO reduces model size and computational complexity by 77.1 % and 69.9 %, respectively, while improving mAP@0.5 by 4.8 %. Compared to YOLOv7-tiny, YOLOv9s, and YOLOv10s, CO-YOLO achieves increases in mAP@0.5 of 3.9 %, 5.6 %, and 5.5 %, respectively. Heatmaps generated by CO-YOLO and YOLO11n indicate that these enhancements significantly improve posture recognition. In summary, the CO-YOLO model exhibits strong performance and provides valuable insights for advancing fruit-picking robotics.
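The abstract does not give the internal structure of the DPW module, but its name suggests the standard depthwise-plus-pointwise factorization of convolution. As a hedged illustration of why such a replacement "reduces the number of parameters", the sketch below compares parameter counts for a standard k×k convolution against a depthwise k×k stage followed by a 1×1 pointwise stage; the channel sizes (64 in, 128 out, 3×3 kernel) are illustrative assumptions, not values from the paper.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    # Standard convolution: one k x k kernel per (input channel, output channel) pair.
    return c_in * c_out * k * k

def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    # Depthwise stage: one k x k kernel per input channel (no channel mixing).
    depthwise = c_in * k * k
    # Pointwise stage: a 1 x 1 convolution that mixes channels.
    pointwise = c_in * c_out
    return depthwise + pointwise

std = conv_params(64, 128, 3)                  # 73728
sep = depthwise_separable_params(64, 128, 3)   # 8768
print(std, sep, round(100 * (1 - sep / std), 1))  # 73728 8768 88.1
```

For these illustrative channel sizes the factorized form uses roughly 88 % fewer parameters than the standard convolution, which is consistent in spirit with the large model-size reductions the paper reports, though the exact DPW design may differ.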
Journal introduction:
Computers and Electronics in Agriculture provides international coverage of advancements in computer hardware, software, electronic instrumentation, and control systems applied to agricultural challenges. Encompassing agronomy, horticulture, forestry, aquaculture, and animal farming, the journal publishes original papers, reviews, and application notes. It explores the use of computers and electronics in plant or animal agricultural production, covering topics such as agricultural soils, water, pests, controlled environments, and waste. The scope extends to on-farm post-harvest operations and relevant technologies, including artificial intelligence, sensors, machine vision, robotics, networking, and simulation modeling. Its companion journal, Smart Agricultural Technology, continues the focus on smart applications in production agriculture.