A Universal Visual Detection Method for Camellia oleifera Fruit Picking Robot
Authors: Jinpeng Wang, Fei Yuan, Jialiang Zhou, Meng He, Qianguang Zhen, Chenzhe Fang, Sunan Chen, Hongping Zhou
DOI: 10.1002/rob.22518
Journal: Journal of Field Robotics, 42(5), pp. 2280-2296
Impact factor: 5.2 (JCR Q2, Robotics)
Published: 2025-01-27 (Journal Article)
Full text: https://onlinelibrary.wiley.com/doi/10.1002/rob.22518
Citations: 0
Abstract
A Universal Visual Detection Method for Camellia oleifera Fruit Picking Robot
In recent years, the application of robots to fruit picking has steadily increased. Mechanized methods for harvesting oil tea fruits include comb picking, vibratory picking, and gripping picking, among others. Traditional reliance on a single picking method is limited by variability in fruit size, shading, and environmental conditions. To develop a universal vision system suitable for picking robots that support multiple picking methods, and thereby achieve intelligent harvesting of oil tea fruits, this paper proposes an enhanced You Only Look Once v7 (YOLOv7)-based oil tea fruit recognition method specifically designed for subsequent clamp or comb picking. The network's feature extraction capability is enhanced by incorporating an attention mechanism, an optimized small-target detection layer, and an improved training loss function, thereby improving its detection of occluded and small-target fruits. An innovative Automatic Assignment (AA) method clusters and subclusters the detected oil tea fruits, providing crucial fruit-distribution data for optimizing the robot's picking strategy. Additionally, for vibration harvesting, this paper introduces a vibration-point detection method that combines the Pyramid Scene Parsing Network (PSPNet) semantic segmentation network with connectivity-domain analysis to identify vibration points on the trunks and branches of oil tea trees. Experimental results demonstrate that the generalized visual detection method proposed in this study surpasses existing models in identifying oil tea fruit trees, with the enhanced YOLOv7 model achieving mean average precision, recall, and accuracy of 91.7%, 94.0%, and 94.9%, respectively. The AA method clusters and subclusters oil tea fruits with a processing delay of under 5 ms. For vibration harvesting, PSPNet achieves branch segmentation precision, recall, and intersection-over-union of 97.3%, 96.5%, and 94.5%, respectively.
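The abstract does not spell out the AA method's assignment rule, so the following is only a minimal sketch of how detected fruits might be grouped into clusters: a greedy single-link grouping of bounding-box centers, where the pixel radius is an invented parameter, not a value from the paper.

```python
def cluster_fruits(centers, radius=50.0):
    """Greedy single-link grouping of detected fruit box centers.

    Two fruits share a cluster when a chain of pairwise distances stays
    within `radius` pixels. This is a stand-in for the paper's AA method,
    whose exact assignment rule is not given in the abstract.
    """
    clusters = []
    for p in centers:
        # Every existing cluster that p lies within `radius` of.
        near = [cl for cl in clusters
                if any((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 <= radius ** 2
                       for q in cl)]
        if near:
            # Merge all touched clusters, plus p itself, into one.
            merged = [p] + [q for cl in near for q in cl]
            clusters = [cl for cl in clusters
                        if not any(cl is n for n in near)] + [merged]
        else:
            clusters.append([p])
    return clusters

# Four detections forming two neighbouring pairs -> two clusters.
boxes = [(0, 0), (10, 0), (200, 200), (205, 195)]
print(len(cluster_fruits(boxes)))  # → 2
```

Subclusters could then be obtained by reapplying the same grouping inside each cluster with a smaller radius.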
The proposed branch vibration-point detection method attains a detection accuracy of 93%, effectively pinpointing vibration points on the trunks and branches of oil tea trees. Overall, the proposed visual method can be deployed on robots that use various picking techniques to enable automated harvesting.
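As a rough illustration of the connectivity-domain step, the sketch below takes a binary branch mask (as a thresholded PSPNet output would provide) and returns the centroid of each sufficiently large connected region as a candidate vibration point. The area threshold and the centroid rule are assumptions for illustration, not the paper's actual selection criteria.

```python
from collections import deque

def vibration_points(mask, min_area=20):
    """Centroid of each large foreground region in a binary mask.

    `mask` is a list of rows (1 = branch/trunk pixel). A 4-connectivity
    BFS flood fill stands in for the connectivity-domain analysis step.
    """
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    points = []
    for i in range(h):
        for j in range(w):
            if mask[i][j] and not seen[i][j]:
                # BFS over the connected region starting at (i, j).
                queue, region = deque([(i, j)]), []
                seen[i][j] = True
                while queue:
                    r, c = queue.popleft()
                    region.append((r, c))
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < h and 0 <= nc < w
                                and mask[nr][nc] and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
                if len(region) >= min_area:  # discard small spurious blobs
                    cy = sum(r for r, _ in region) / len(region)
                    cx = sum(c for _, c in region) / len(region)
                    points.append((cy, cx))
    return points

# Toy mask: one branch-like blob (area 24) and a 1-pixel noise blob.
mask = [[0] * 10 for _ in range(10)]
for r in range(1, 4):
    for c in range(1, 9):
        mask[r][c] = 1
mask[7][7] = 1
print(vibration_points(mask))  # → [(2.0, 4.5)]
```

In practice the region centroid would be mapped back to image coordinates and paired with depth data to position the vibration actuator on the branch.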
About the journal:
The Journal of Field Robotics seeks to promote scholarly publications dealing with the fundamentals of robotics in unstructured and dynamic environments.
The Journal focuses on experimental robotics and encourages publication of work that has both theoretical and practical significance.