An efficient and lightweight banana detection and localization system based on deep CNNs for agricultural robots

Impact factor: 6.3 (JCR Q1, Agricultural Engineering)
Journal: Smart Agricultural Technology
DOI: 10.1016/j.atech.2024.100550
Publication date: 2024-08-30
Publication type: Journal Article
Article page: https://www.sciencedirect.com/science/article/pii/S2772375524001552
PDF: https://www.sciencedirect.com/science/article/pii/S2772375524001552/pdfft?md5=e7498f509bd40627ea3219763c994e78&pid=1-s2.0-S2772375524001552-main.pdf
Indexed via: Semantic Scholar
Citations: 0

Abstract

Accurate detection and localization of fruit in natural environments is a key step toward precise robotic harvesting. Existing banana detection and localization methods, however, have two main limitations in practice: their large parameter counts make deployment difficult, and their performance leaves room for improvement. To address both issues, we propose a high-precision, lightweight method for recognizing and localizing banana bunches and deploy it on an edge device. First, we propose Slim-Banana, a model derived from YOLOv8l. To reduce computation while maintaining high performance, Slim-Banana replaces standard convolution with GSConv and combines it with grouped convolution and spatial convolution. In addition, a cross-stage local network (GSCSP) module is designed that uses a single-stage aggregation method to reduce both the computational cost and the complexity of the network structure. Next, a RealSense depth sensor is combined with time-of-flight (TOF) technology to perform image registration and 3D localization of the bananas. Finally, the pipeline is deployed on an Nvidia Orin NX edge device, and its performance and resource consumption under real working conditions are analyzed in detail. Experiments show that the method achieves a detection precision of 0.947, a recall of 0.948, an mAP of 0.98, and an inference time of 113.6 ms, and that the network requires 4449 MiB of memory; the average localization errors along the X, Y, and Z axes are 13.47 mm, 12.87 mm, and 13.87 mm, respectively. To our knowledge, this is the first work to implement banana detection and localization on edge devices. Compared with existing methods, our method performs better in complex orchard environments and delivers efficient, lightweight banana recognition and localization.
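
The abstract does not spell out how a detection is turned into a 3D position, so the following is only a minimal sketch of one common approach, not the authors' implementation. It assumes a pinhole camera model, a depth image already registered to the colour image, and a depth reading taken at the centre of the detected bounding box. The class Intrinsics, the helper names, and all numeric values are hypothetical and chosen for illustration. The last helper shows how per-axis mean absolute errors, such as those reported for the X, Y, and Z axes, could be computed from paired predicted and reference points.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Intrinsics:
    """Pinhole intrinsics of the depth-aligned colour camera (hypothetical values)."""
    fx: float  # focal length along x, in pixels
    fy: float  # focal length along y, in pixels
    cx: float  # principal point x, in pixels
    cy: float  # principal point y, in pixels


def bbox_center(x1, y1, x2, y2):
    """Centre pixel of a bounding box given its corner coordinates."""
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def deproject(u, v, depth_mm, k):
    """Back-project pixel (u, v) with depth Z (mm) to camera-frame (X, Y, Z) in mm."""
    x = (u - k.cx) * depth_mm / k.fx
    y = (v - k.cy) * depth_mm / k.fy
    return (x, y, depth_mm)


def mean_abs_error_per_axis(predicted, reference):
    """Mean absolute error along X, Y and Z over paired 3D points."""
    return tuple(
        mean(abs(p[i] - r[i]) for p, r in zip(predicted, reference))
        for i in range(3)
    )


if __name__ == "__main__":
    # Assumed intrinsics and a made-up detection box; not values from the paper.
    k = Intrinsics(fx=600.0, fy=600.0, cx=320.0, cy=240.0)
    u, v = bbox_center(380, 200, 500, 400)
    print(deproject(u, v, 1000.0, k))
```

In a real pipeline the intrinsics and the aligned depth value would come from the depth sensor's SDK rather than hard-coded constants, and a robust depth estimate (for example a median over a small window) is often preferred to a single pixel reading.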

Source journal metrics: CiteScore 4.20; self-citation rate 0.00%; articles published 0