Semantic Segmentation and 6DoF Pose Estimation using RGB-D Images and Deep Neural Networks
V. Tran, Huei-Yung Lin
2021 IEEE 30th International Symposium on Industrial Electronics (ISIE), published 2021-06-20
DOI: 10.1109/isie45552.2021.9576248
Abstract
6DoF object pose estimation is an essential task for manipulation robots in robotic and industrial applications. While deep learning has driven significant progress in object detection and semantic segmentation, 6DoF pose estimation remains challenging. It is typically performed with visual sensors to provide the information a robotic manipulator needs to interact with target objects; thus, 6DoF pose estimation and object recognition from point clouds or RGB-D images are essential tasks for visual servoing. This paper proposes a learning-based method for estimating the 6DoF pose of objects for manipulation robots in industrial settings. A deep convolutional neural network (CNN) performs semantic segmentation on RGB images to determine the target object region, which is then combined with depth information to estimate the 6DoF object pose using the ICP algorithm. We built our own dataset for training and evaluation, using mIoU as the segmentation metric. Compared to other approaches that use a limited amount of training data, the proposed approach provides better performance. For the robotic grasping application, we tested and validated the solution using a HIWIN 6-axis robot with an ASUS Xtion Live 3D camera. We demonstrate robotic grasping with this method, which accurately estimates 6DoF object poses and achieves a high success rate.
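The abstract describes a pipeline in which a segmented object point cloud (from the CNN mask plus depth) is registered against a reference model with ICP to recover the 6DoF pose. The paper's exact ICP variant and initialization are not given in the abstract, so the following is only a minimal point-to-point ICP sketch in NumPy (brute-force nearest neighbours, Kabsch/SVD alignment); all function names are illustrative.

```python
import numpy as np

def best_fit_transform(src, dst):
    """Least-squares rigid transform (R, t) mapping src onto dst (Kabsch/SVD)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:   # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = cd - R @ cs
    return R, t

def icp(src, dst, iters=50, tol=1e-7):
    """Point-to-point ICP aligning the src cloud to the dst cloud.

    Returns the accumulated rotation R and translation t such that
    src @ R.T + t approximates dst.
    """
    R_tot, t_tot = np.eye(3), np.zeros(3)
    cur = src.copy()
    prev_err = np.inf
    for _ in range(iters):
        # Nearest-neighbour correspondences (brute force, for clarity only).
        d = np.linalg.norm(cur[:, None, :] - dst[None, :, :], axis=2)
        nn = d.argmin(axis=1)
        R, t = best_fit_transform(cur, dst[nn])
        cur = cur @ R.T + t
        # Compose the incremental transform into the running estimate.
        R_tot, t_tot = R @ R_tot, R @ t_tot + t
        err = d[np.arange(len(cur)), nn].mean()
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return R_tot, t_tot
```

In a real system the brute-force correspondence search would be replaced by a k-d tree, and ICP would be seeded with a coarse initial pose, since plain ICP only converges from a nearby starting alignment.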
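The abstract evaluates segmentation quality with mIoU. As a reference for how that metric is commonly computed (the paper's exact evaluation protocol is not stated here), a minimal sketch averaging per-class intersection-over-union over the classes present in either mask:

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union between integer label masks."""
    ious = []
    for c in range(num_classes):
        p, g = (pred == c), (gt == c)
        union = np.logical_or(p, g).sum()
        if union == 0:
            continue  # class absent from both masks: skip, don't count as 0
        inter = np.logical_and(p, g).sum()
        ious.append(inter / union)
    return float(np.mean(ious))
```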