Object-RPE: Dense 3D Reconstruction and Pose Estimation with Convolutional Neural Networks for Warehouse Robots

Dinh-Cuong Hoang, Todor Stoyanov, A. Lilienthal
2019 European Conference on Mobile Robots (ECMR), published 2019-08-22.
DOI: 10.1109/ECMR.2019.8870927
Citations: 10

Abstract

We present a system for accurate 3D instance-aware semantic reconstruction and 6D pose estimation using an RGB-D camera. Our framework couples convolutional neural networks (CNNs) with a state-of-the-art dense Simultaneous Localisation and Mapping (SLAM) system, ElasticFusion, to achieve both high-quality semantic reconstruction and robust 6D pose estimation for relevant objects. The method presented in this paper extends a high-quality instance-aware semantic 3D mapping system from previous work [1] by adding a 6D object pose estimator. While the main trend in CNN-based 6D pose estimation has been to infer an object's position and orientation from a single view of the scene, our approach explores pose estimation from multiple viewpoints, under the conjecture that combining multiple predictions can improve the robustness of an object detection system. The resulting system is capable of producing high-quality object-aware semantic reconstructions of room-sized environments, as well as accurately detecting objects and their 6D poses. The developed method has been verified through experimental validation on the YCB-Video dataset and a newly collected warehouse object dataset. Experimental results confirm that the proposed system improves over state-of-the-art methods in terms of surface reconstruction and object pose prediction. Our code and video are available at https://sites.google.com/view/object-rpe.
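The abstract does not specify how per-view predictions are combined; the following is a minimal, hypothetical sketch of one common fusion rule for the multi-view idea described above: a confidence-weighted average of predicted translations, and a weighted quaternion average (via the dominant eigenvector of the weighted outer-product matrix) for rotations. The function name `fuse_pose_predictions` and the weighting scheme are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def fuse_pose_predictions(translations, quaternions, confidences):
    """Fuse N per-view 6D pose predictions into a single pose.

    translations: (N, 3) predicted object positions, one per viewpoint
    quaternions:  (N, 4) unit quaternions (w, x, y, z), one per viewpoint
    confidences:  (N,)   per-view prediction confidences
    Returns the fused (translation, quaternion).
    """
    w = np.asarray(confidences, dtype=float)
    w = w / w.sum()                      # normalize weights

    # Confidence-weighted mean of the translation predictions.
    t = (w[:, None] * np.asarray(translations, dtype=float)).sum(axis=0)

    # Weighted quaternion average: the fused rotation is the eigenvector
    # of the weighted outer-product matrix M with the largest eigenvalue.
    Q = np.asarray(quaternions, dtype=float)
    M = (w[:, None, None] * Q[:, :, None] * Q[:, None, :]).sum(axis=0)
    eigvals, eigvecs = np.linalg.eigh(M)  # ascending eigenvalues
    q = eigvecs[:, -1]                    # dominant eigenvector
    if q[0] < 0:                          # resolve the q / -q sign ambiguity
        q = -q
    return t, q
```

For example, three noisy views of an object one metre in front of the camera, with small opposite rotation errors, fuse back to roughly the identity rotation at z = 1.0.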