MFPN-6D: RGB图像上对象的实时单阶段姿态估计

2021 IEEE International Conference on Robotics and Automation (ICRA) Pub Date : 2021-05-30 DOI:10.1109/ICRA48506.2021.9561878

Penglei Liu, Qieshi Zhang, Jin Zhang, Fei Wang, Jun Cheng

{"title":"MFPN-6D: RGB图像上对象的实时单阶段姿态估计","authors":"Penglei Liu, Qieshi Zhang, Jin Zhang, Fei Wang, Jun Cheng","doi":"10.1109/ICRA48506.2021.9561878","DOIUrl":null,"url":null,"abstract":"6D pose estimation of objects is an important part of robot grasping. The latest research trend on 6D pose estimation is to train a deep neural network to directly predict the 2D projection position of the 3D key points from the image, establish the corresponding relationship, and finally use Pespective-n-Point (PnP) algorithm performs pose estimation. The current challenge of pose estimation is that when the object texture-less, occluded and scene clutter, the detection accuracy will be reduced, and most of the existing algorithm models are large and cannot take the real-time requirements. In this paper, we introduce a Multi-directional Feature Pyramid Network, MFPN, which can efficiently integrate and utilize features. We combined the Cross Stage Partial Network (CSPNet) with MFPN to design a new network for 6D pose estimation, MFPN-6D. At the same time, we propose a new confidence calculation method for object pose estimation, which can fully consider spatial information and plane information. At last, we tested our method on the LINEMOD and Occluded-LINEMOD datasets. The experimental results demonstrate that our algorithm is robust to textureless materials and occlusion, while running more efficiently compared to other methods.","PeriodicalId":108312,"journal":{"name":"2021 IEEE International Conference on Robotics and Automation (ICRA)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"MFPN-6D : Real-time One-stage Pose Estimation of Objects on RGB Images\",\"authors\":\"Penglei Liu, Qieshi Zhang, Jin Zhang, Fei Wang, Jun Cheng\",\"doi\":\"10.1109/ICRA48506.2021.9561878\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"6D pose estimation of objects is an important part of robot grasping. The latest research trend on 6D pose estimation is to train a deep neural network to directly predict the 2D projection position of the 3D key points from the image, establish the corresponding relationship, and finally use Pespective-n-Point (PnP) algorithm performs pose estimation. The current challenge of pose estimation is that when the object texture-less, occluded and scene clutter, the detection accuracy will be reduced, and most of the existing algorithm models are large and cannot take the real-time requirements. In this paper, we introduce a Multi-directional Feature Pyramid Network, MFPN, which can efficiently integrate and utilize features. We combined the Cross Stage Partial Network (CSPNet) with MFPN to design a new network for 6D pose estimation, MFPN-6D. At the same time, we propose a new confidence calculation method for object pose estimation, which can fully consider spatial information and plane information. At last, we tested our method on the LINEMOD and Occluded-LINEMOD datasets. The experimental results demonstrate that our algorithm is robust to textureless materials and occlusion, while running more efficiently compared to other methods.\",\"PeriodicalId\":108312,\"journal\":{\"name\":\"2021 IEEE International Conference on Robotics and Automation (ICRA)\",\"volume\":\"64 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-05-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 IEEE International Conference on Robotics and Automation (ICRA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICRA48506.2021.9561878\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE International Conference on Robotics and Automation (ICRA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICRA48506.2021.9561878","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 4

摘要

物体的姿态估计是机器人抓取的重要组成部分。6D姿态估计的最新研究趋势是训练深度神经网络直接从图像中预测三维关键点的二维投影位置，建立对应关系，最后使用PnP算法进行姿态估计。当前姿态估计面临的挑战是，当目标无纹理、遮挡和场景杂乱时，检测精度会降低，而且现有的大多数算法模型都很大，不能满足实时性要求。本文提出了一种能够有效整合和利用特征的多向特征金字塔网络(MFPN)。我们将交叉阶段部分网络(CSPNet)与MFPN相结合，设计了一种新的6D姿态估计网络MFPN-6D。同时，我们提出了一种新的目标姿态估计置信度计算方法，该方法可以充分考虑空间信息和平面信息。最后，我们在LINEMOD和occlded -LINEMOD数据集上测试了我们的方法。实验结果表明，该算法对无纹理材料和遮挡具有较强的鲁棒性，运行效率高于其他方法。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

MFPN-6D : Real-time One-stage Pose Estimation of Objects on RGB Images

6D pose estimation of objects is an important part of robot grasping. The latest research trend on 6D pose estimation is to train a deep neural network to directly predict the 2D projection position of the 3D key points from the image, establish the corresponding relationship, and finally use Pespective-n-Point (PnP) algorithm performs pose estimation. The current challenge of pose estimation is that when the object texture-less, occluded and scene clutter, the detection accuracy will be reduced, and most of the existing algorithm models are large and cannot take the real-time requirements. In this paper, we introduce a Multi-directional Feature Pyramid Network, MFPN, which can efficiently integrate and utilize features. We combined the Cross Stage Partial Network (CSPNet) with MFPN to design a new network for 6D pose estimation, MFPN-6D. At the same time, we propose a new confidence calculation method for object pose estimation, which can fully consider spatial information and plane information. At last, we tested our method on the LINEMOD and Occluded-LINEMOD datasets. The experimental results demonstrate that our algorithm is robust to textureless materials and occlusion, while running more efficiently compared to other methods.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2021 IEEE International Conference on Robotics and Automation (ICRA)

自引率

0.00%

发文量