Detection of multi-size peach in orchard using RGB-D camera combined with an improved DEtection Transformer model

IF 0.8 | Zone 4 (Computer Science) | JCR Q4, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Intelligent Data Analysis | Pub Date: 2023-08-07 | DOI: 10.3233/ida-220449

Yu Yang, Xin Wang, Zhenfang Liu, Min Huang, Shangpeng Sun, Qibing Zhu


Citations: 0


Detection of multi-size peach in orchard using RGB-D camera combined with an improved DEtection Transformer model

The first major contribution of the paper is an improved DEtection Transformer network (named R2N-DETR) that, together with a Kinect-V2 camera, detects multi-size peaches in orchards under varied illumination and fruit occlusion. The R2N-DETR model first employs Res2Net-50 to extract, from Red-Green-Blue-Depth (RGB-D) images, a fused low-high level feature map containing fine spatial features and precise semantic information of multi-size peaches. Second, an encoder-decoder is applied to the feature map to obtain the global context. Finally, all objects are detected according to each object's global context. On 1101 RGB-D images (captured in two orchards over three years), the R2N-DETR model achieves an average precision of 0.944 and an average detection time of 53 ms per image. The developed system could provide precise visual guidance for robotic picking and contribute to improving yield prediction by providing accurate fruit counting.
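The abstract reports an average precision of 0.944 but does not state the IoU threshold or interpolation scheme behind that figure. As a rough illustration only, the sketch below shows one common way such a score is computed: the area under the precision-recall curve with all-point interpolation. The function name, the argument names, and the assumption that each detection has already been marked as a true or false positive by some matching step are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def average_precision(detections: List[Tuple[float, bool]],
                      num_ground_truth: int) -> float:
    """Area under the precision-recall curve (all-point interpolation).

    detections: (confidence, is_true_positive) pairs, one per predicted box.
                The true/false-positive flag is assumed to come from an
                IoU-based matching step that is not shown here.
    num_ground_truth: number of labelled fruits in the evaluated images.
    """
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")

    # Rank predictions from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    precisions: List[float] = []
    recalls: List[float] = []
    true_positives = 0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        true_positives += int(is_tp)
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)

    # Precision envelope: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum precision over each increment in recall.
    ap = 0.0
    previous_recall = 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


if __name__ == "__main__":
    # Toy data, not results from the paper: 4 labelled fruits, 4 predictions.
    toy = [(0.9, True), (0.8, True), (0.7, False), (0.6, True)]
    print(average_precision(toy, num_ground_truth=4))
```

Missed fruits lower the score because recall never reaches 1, and confident false positives lower it because they depress precision early in the ranking.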


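Source journal

Intelligent Data Analysis (Engineering and Technology - Computer Science: Artificial Intelligence)

CiteScore: 2.20

Self-citation rate: 5.90%

Articles published:

Review time: 3.3 months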

Journal description: Intelligent Data Analysis provides a forum for the examination of issues related to the research and applications of Artificial Intelligence techniques in data analysis across a variety of disciplines. These techniques include (but are not limited to): all areas of data visualization, data pre-processing (fusion, editing, transformation, filtering, sampling), data engineering, database mining techniques, tools and applications, use of domain knowledge in data analysis, big data applications, evolutionary algorithms, machine learning, neural nets, fuzzy logic, statistical pattern recognition, knowledge filtering, and post-processing. In particular, papers are preferred that discuss the development of new AI-related data analysis architectures, methodologies, and techniques and their applications to various domains.