SamPose: Generalizable Model-Free 6D Object Pose Estimation via Single-View Prompt
Authors: Wubin Shi; Shaoyan Gai; Feipeng Da; Zeyu Cai
Journal: IEEE Robotics and Automation Letters, vol. 10, no. 5, pp. 4420-4427
Publication date: 2025-03-12
Publication type: Journal Article
DOI: 10.1109/LRA.2025.3550796
URL: https://ieeexplore.ieee.org/document/10923719/
Impact factor: 4.6 (JCR Q2, Robotics)
Citations: 0
Abstract
Object pose estimation in open-world scenarios is a critical challenge in robotics, virtual reality, and autonomous driving. In this letter, we introduce SamPose, a novel framework for model-free 6DoF pose estimation of any target object in open-world environments using only a single-view prompt. SamPose consists mainly of an Open-world Object Detector (OOD) and a Coarse-to-Fine Pose Estimator (CFPE). The OOD uses a pre-trained EfficientSAM model to perform zero-shot segmentation and matching. It selects the proposals most similar to the new object based on matching scores derived from semantic, geometric, and local descriptors. In the CFPE phase, a sparse keypoint matcher, guided by DINO semantics, first performs robust keypoint matching and calculates an initial pose. Then, after the perspectives of the two views are aligned, a two-stage semi-dense keypoint matcher computes reliable point correspondences, from which the object's final pose is determined. Extensive experiments demonstrate the robustness and competitive performance of SamPose.
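The proposal-selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the semantic, geometric, and local scores are computed or combined, so everything here is an assumption made for illustration: the `Descriptors` container, the use of cosine similarity for each cue, the equal weights, and the function names `matching_score` and `select_proposal` are hypothetical and are not taken from the SamPose implementation.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


@dataclass
class Descriptors:
    """Hypothetical container: one feature vector per cue named in the abstract."""
    semantic: List[float]
    geometric: List[float]
    local: List[float]


def matching_score(
    prompt: Descriptors,
    proposal: Descriptors,
    weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3),
) -> float:
    """Weighted sum of per-cue cosine similarities (assumed combination rule)."""
    w_sem, w_geo, w_loc = weights
    return (
        w_sem * cosine(prompt.semantic, proposal.semantic)
        + w_geo * cosine(prompt.geometric, proposal.geometric)
        + w_loc * cosine(prompt.local, proposal.local)
    )


def select_proposal(
    prompt: Descriptors, proposals: List[Descriptors]
) -> Tuple[int, List[float]]:
    """Return the index of the highest-scoring proposal and all scores."""
    if not proposals:
        raise ValueError("at least one proposal is required")
    scores = [matching_score(prompt, p) for p in proposals]
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return best_index, scores


# Toy usage: the second proposal has the same descriptors as the prompt.
prompt = Descriptors([1.0, 0.0], [0.5, 0.5], [0.0, 1.0])
proposals = [
    Descriptors([0.0, 1.0], [0.5, -0.5], [1.0, 0.0]),  # orthogonal on every cue
    Descriptors([1.0, 0.0], [0.5, 0.5], [0.0, 1.0]),   # identical to the prompt
]
best_index, scores = select_proposal(prompt, proposals)
```

In this toy example the second proposal is selected because it matches the prompt on all three cues. In a real system the vectors would come from the segmentation proposals and the single-view prompt, and the weighting would need tuning.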
About the journal:
The scope of this journal is to publish peer-reviewed articles that provide a timely and concise account of innovative research ideas and application results, reporting significant theoretical findings and application case studies in areas of robotics and automation.