On the Importance of Label Encoding and Uncertainty Estimation for Robotic Grasp Detection
Benedict Stephan, Dustin Aganian, Lars Hinneburg, M. Eisenbach, Steffen Müller, H. Groß
2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022-10-23. DOI: 10.1109/IROS47612.2022.9981866
Citations: 1
Abstract
Automated grasping of arbitrary objects is an essential skill for many applications such as smart manufacturing and human-robot interaction. This makes grasp detection a vital capability for automated robotic systems. Recent work on model-free grasp detection uses point cloud data as input and typically outperforms earlier RGB(D)-based methods. We show that RGB(D)-based methods are being underestimated due to suboptimal label encodings used for training. Using the evaluation pipeline of the GraspNet-1Billion dataset, we investigate different encodings and propose a novel encoding that significantly improves grasp detection on depth images. Additionally, we expose shortcomings of the 2D rectangle grasps supplied by the GraspNet-1Billion dataset and propose a filtering scheme that significantly improves the ground truth labels. Furthermore, we apply established methods for uncertainty estimation to our trained models, since knowing when the model's decisions can be trusted is an advantage in real-world applications. In doing so, we are the first to directly estimate the uncertainties of detected grasps. We also investigate the applicability of the estimated aleatoric and epistemic uncertainties based on their theoretical properties. Additionally, we demonstrate a correlation between the estimated uncertainties and grasp quality, thereby improving the selection of high-quality grasp detections. With all these modifications, our approach using only depth images can compete with point-cloud-based approaches for grasp detection, despite the lower number of degrees of freedom for grasp poses in 2D image space.
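The abstract refers to "established methods for uncertainty estimation" without naming them here. A common choice in this line of work is Monte Carlo dropout for epistemic (model) uncertainty combined with a learned-variance output for aleatoric (data) uncertainty, in the style of Kendall and Gal. The sketch below is a minimal illustration under that assumption, not the paper's actual implementation; all class, function, and parameter names are hypothetical.

```python
# Minimal sketch: MC dropout (epistemic) plus a predicted log-variance head
# (aleatoric) on a per-grasp quality score. Assumed setup, not the paper's code.
import torch
import torch.nn as nn


class GraspQualityHead(nn.Module):
    """Tiny regression head predicting a grasp-quality score and its
    aleatoric uncertainty as a log-variance (hypothetical architecture)."""

    def __init__(self, in_features: int = 128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(in_features, 64),
            nn.ReLU(),
            nn.Dropout(p=0.2),  # kept active at test time for MC dropout
        )
        self.mean = nn.Linear(64, 1)     # predicted grasp quality
        self.log_var = nn.Linear(64, 1)  # predicted aleatoric log-variance

    def forward(self, x: torch.Tensor):
        h = self.backbone(x)
        return self.mean(h), self.log_var(h)


def mc_dropout_predict(model: GraspQualityHead, x: torch.Tensor, n_samples: int = 20):
    """Run several stochastic forward passes with dropout enabled.

    Returns the mean prediction, the epistemic uncertainty (variance of the
    sampled means), and the mean aleatoric variance (exp of predicted log-var).
    """
    model.train()  # enable dropout; in practice, freeze any batch-norm layers
    means, ale_vars = [], []
    with torch.no_grad():
        for _ in range(n_samples):
            mu, log_var = model(x)
            means.append(mu)
            ale_vars.append(log_var.exp())
    means = torch.stack(means)    # shape: (n_samples, batch, 1)
    epistemic = means.var(dim=0)  # spread across stochastic passes
    aleatoric = torch.stack(ale_vars).mean(dim=0)
    return means.mean(dim=0), epistemic, aleatoric


# Usage sketch: score detected grasps, then discard those whose combined
# uncertainty exceeds a threshold before executing them on the robot.
head = GraspQualityHead()
features = torch.randn(8, 128)  # hypothetical per-grasp feature vectors
quality, epi, ale = mc_dropout_predict(head, features)
trusted = quality[(epi + ale) < 0.05]
```

This mirrors the selection idea described in the abstract: because estimated uncertainty correlates with grasp quality, filtering out high-uncertainty detections should leave a pool of more reliable grasps for execution.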