Jun Wu;Sicheng Li;Sihui Ji;Yifei Yang;Yue Wang;Rong Xiong;Yiyi Liao
{"title":"DORec:利用二维自监督特征进行分解对象重构和分割","authors":"Jun Wu;Sicheng Li;Sihui Ji;Yifei Yang;Yue Wang;Rong Xiong;Yiyi Liao","doi":"10.1109/LRA.2024.3511425","DOIUrl":null,"url":null,"abstract":"Recovering 3D geometry and textures of individual objects is crucial for many robotics applications, such as manipulation, pose estimation, and autonomous driving. However, decomposing a target object from a complex background is challenging. Most existing approaches rely on costly manual labels to acquire object instance perception. Recent advancements in 2D self-supervised learning offer new prospects for identifying objects of interest, yet leveraging such noisy 2D features for clean decomposition remains difficult. In this paper, we propose a Decomposed Object Reconstruction (DORec) network based on neural implicit representations. Our key idea is to use 2D self-supervised features to create two levels of masks for supervision: a binary mask for foreground regions and a K-cluster mask for semantically similar regions. These complementary masks result in robust decomposition. Experimental results on different datasets show DORec's superiority in segmenting and reconstructing diverse foreground objects from varied backgrounds enabling downstream tasks such as pose estimation.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 1","pages":"804-811"},"PeriodicalIF":4.6000,"publicationDate":"2024-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"DORec: Decomposed Object Reconstruction and Segmentation Utilizing 2D Self-Supervised Features\",\"authors\":\"Jun Wu;Sicheng Li;Sihui Ji;Yifei Yang;Yue Wang;Rong Xiong;Yiyi Liao\",\"doi\":\"10.1109/LRA.2024.3511425\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Recovering 3D geometry and textures of individual objects is crucial for many robotics applications, such as manipulation, pose estimation, and autonomous driving. However, decomposing a target object from a complex background is challenging. Most existing approaches rely on costly manual labels to acquire object instance perception. Recent advancements in 2D self-supervised learning offer new prospects for identifying objects of interest, yet leveraging such noisy 2D features for clean decomposition remains difficult. In this paper, we propose a Decomposed Object Reconstruction (DORec) network based on neural implicit representations. Our key idea is to use 2D self-supervised features to create two levels of masks for supervision: a binary mask for foreground regions and a K-cluster mask for semantically similar regions. These complementary masks result in robust decomposition. Experimental results on different datasets show DORec's superiority in segmenting and reconstructing diverse foreground objects from varied backgrounds enabling downstream tasks such as pose estimation.\",\"PeriodicalId\":13241,\"journal\":{\"name\":\"IEEE Robotics and Automation Letters\",\"volume\":\"10 1\",\"pages\":\"804-811\"},\"PeriodicalIF\":4.6000,\"publicationDate\":\"2024-12-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Robotics and Automation Letters\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10777610/\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"ROBOTICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Robotics and Automation Letters","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10777610/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ROBOTICS","Score":null,"Total":0}
引用次数: 0
摘要
恢复单个物体的三维几何图形和纹理对许多机器人应用(如操纵、姿势估计和自动驾驶)至关重要。然而,从复杂的背景中分解目标物体是一项挑战。大多数现有方法都依赖于昂贵的人工标签来获取物体实例感知。最近在二维自监督学习方面取得的进展为识别感兴趣的物体提供了新的前景,但利用这种嘈杂的二维特征进行干净的分解仍然很困难。在本文中,我们提出了一种基于神经隐式表征的分解物体重构(DORec)网络。我们的主要想法是利用二维自监督特征创建两级监督掩码:前景区域的二进制掩码和语义相似区域的 K 簇掩码。这些互补的掩码可实现稳健的分解。在不同数据集上的实验结果表明,DORec 在从不同背景中分割和重建各种前景物体方面具有优势,可以完成姿态估计等下游任务。
DORec: Decomposed Object Reconstruction and Segmentation Utilizing 2D Self-Supervised Features
Recovering 3D geometry and textures of individual objects is crucial for many robotics applications, such as manipulation, pose estimation, and autonomous driving. However, decomposing a target object from a complex background is challenging. Most existing approaches rely on costly manual labels to acquire object instance perception. Recent advancements in 2D self-supervised learning offer new prospects for identifying objects of interest, yet leveraging such noisy 2D features for clean decomposition remains difficult. In this paper, we propose a Decomposed Object Reconstruction (DORec) network based on neural implicit representations. Our key idea is to use 2D self-supervised features to create two levels of masks for supervision: a binary mask for foreground regions and a K-cluster mask for semantically similar regions. These complementary masks result in robust decomposition. Experimental results on different datasets show DORec's superiority in segmenting and reconstructing diverse foreground objects from varied backgrounds enabling downstream tasks such as pose estimation.
期刊介绍:
The scope of this journal is to publish peer-reviewed articles that provide a timely and concise account of innovative research ideas and application results, reporting significant theoretical findings and application case studies in areas of robotics and automation.