Dual-COPE: A novel prior-based category-level object pose estimation network with dual Sim2Real unsupervised domain adaptation module

IF 2.5 · Region 4 (Computer Science) · JCR Q2: COMPUTER SCIENCE, SOFTWARE ENGINEERING

Computers & Graphics-Uk · Pub Date: 2024-08-14 · DOI: 10.1016/j.cag.2024.104045

Xi Ren, Nan Guo, Zichen Zhu, Xinbei Jiang

Computers & Graphics-Uk, Volume 124, Article 104045. Full text: https://www.sciencedirect.com/science/article/pii/S0097849324001808

Citations: 0

Abstract

Category-level pose estimation generalizes to novel objects unseen during training and has attracted increasing attention in recent years. Despite this advantage, annotating real-world data with pose labels is intricate and laborious. Although synthetic data with free annotations can greatly reduce training costs, the Synthetic-to-Real (Sim2Real) domain gap can cause a sharp performance drop on real-world tests. In this paper, we propose Dual-COPE, a novel prior-based category-level object pose estimation method with dual Sim2Real domain adaptation that avoids expensive real pose annotations. First, we propose an estimation network featuring conjoined prior deformation and transformer-based matching to achieve high-precision pose prediction. On top of that, an efficient dual Sim2Real domain adaptation module is designed to reduce the feature distribution discrepancy between synthetic and real-world data both semantically and geometrically, thus maintaining superior performance on real-world tests. Moreover, the adaptation module is loosely coupled with the estimation network, allowing easy integration with other methods without any additional inference overhead. Comprehensive experiments show that Dual-COPE outperforms existing unsupervised methods and achieves state-of-the-art precision under supervised settings.
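The abstract does not state which discrepancy measure Dual-COPE uses, so the following is only a generic illustration of what a "feature distribution discrepancy" between synthetic and real-world features can look like. It computes a linear-kernel maximum mean discrepancy (the squared distance between the two batch feature means) on toy Gaussian data. The function names, batch sizes, and data are hypothetical and are not taken from the paper.

```python
import random


def mean_vector(feats):
    """Column-wise mean of a batch of feature vectors (list of lists)."""
    n = len(feats)
    return [sum(col) / n for col in zip(*feats)]


def linear_mmd2(source_feats, target_feats):
    """Squared distance between batch means (linear-kernel MMD).

    Illustrative stand-in only; NOT the loss used in the paper.
    """
    mu_s = mean_vector(source_feats)
    mu_t = mean_vector(target_feats)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


def sample_batch(rng, n, dim, shift):
    """Toy features: unit-variance Gaussians centred at `shift`."""
    return [[rng.gauss(shift, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
synthetic = sample_batch(rng, 256, 16, 0.0)     # stands in for synthetic-domain features
real_shifted = sample_batch(rng, 256, 16, 0.5)  # real-domain features with a domain gap
real_aligned = sample_batch(rng, 256, 16, 0.0)  # real-domain features after ideal alignment

gap_shifted = linear_mmd2(synthetic, real_shifted)
gap_aligned = linear_mmd2(synthetic, real_aligned)
print(f"discrepancy with domain gap: {gap_shifted:.3f}")
print(f"discrepancy when aligned:    {gap_aligned:.3f}")
```

In a setup like this, a discrepancy term would typically be minimized as an extra training objective and dropped at test time, which is one way an adaptation module can add no inference overhead, as the abstract claims for Dual-COPE.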




Source journal

Computers & Graphics-Uk (Engineering & Technology; Computer Science: Software Engineering)

CiteScore: 5.30

Self-citation rate: 12.00%

Articles published: 173

Review time: 38 days

About the journal: Computers & Graphics is dedicated to disseminating information on research and applications of computer graphics (CG) techniques. The journal encourages articles on: 1. Research and applications of interactive computer graphics. We are particularly interested in novel interaction techniques and applications of CG to problem domains. 2. State-of-the-art papers on late-breaking, cutting-edge research on CG. 3. Information on innovative uses of graphics principles and technologies. 4. Tutorial papers on both teaching CG principles and innovative uses of CG in education.