DPCN++: Differentiable Phase Correlation Network for Versatile Pose Registration

Impact factor: 20.8 | Region 1: Computer Science | JCR Q1: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Zexi Chen, Yiyi Liao, Haozhe Du, Haodong Zhang, Xuecheng Xu, Haojian Lu, R. Xiong, Yue Wang
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Publication date: 2022-06-12
Publication type: Journal Article
DOI: 10.48550/arXiv.2206.05707 (https://doi.org/10.48550/arXiv.2206.05707)
Citations: 0

Abstract

Pose registration is critical in vision and robotics. This paper focuses on the challenging task of initialization-free pose registration with up to 7 DoF for homogeneous and heterogeneous measurements. While recent learning-based methods show promise using differentiable solvers, they either rely on heuristically defined correspondences or require initialization. Phase correlation seeks solutions in the spectral domain and is both correspondence-free and initialization-free. Building on this, we propose a differentiable phase correlation solver and combine it with simple feature extraction networks; we call the result DPCN++. It can perform registration for homogeneous and heterogeneous inputs and generalizes well to unseen objects. Specifically, the feature extraction networks first learn dense feature grids from a pair of homogeneous or heterogeneous measurements. These feature grids are then transformed into a translation- and scale-invariant spectrum representation based on the Fourier transform and spherical radial aggregation, decoupling translation and scale from rotation. Next, the rotation, scale, and translation are estimated independently and efficiently in the spectrum, step by step. The entire pipeline is differentiable and trained end-to-end. We evaluate DPCN++ on a wide range of tasks with different input modalities, including 2D bird's-eye view images, 3D object and scene measurements, and medical images. Experimental results demonstrate that DPCN++ outperforms both classical and learning-based baselines, especially on partially observed and heterogeneous measurements.
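The abstract rests on two properties of the Fourier transform: a translation changes only the phase of the spectrum (so the magnitude spectrum is translation-invariant), and the displacement can be read off as the peak of the inverse-transformed, normalised cross-power spectrum. The sketch below illustrates only this classical 2D translation case with NumPy. It is not the DPCN++ solver: the hard `argmax` used here is not differentiable, rotation and scale are not handled, and the function name `phase_correlation_shift` and the synthetic test image are assumptions made for illustration.

```python
import numpy as np


def phase_correlation_shift(reference, moved):
    """Estimate the integer circular shift that maps `reference` onto `moved`.

    Classical phase correlation on two equally sized 2D arrays
    (illustrative helper, not taken from the paper).
    """
    ref_spectrum = np.fft.fft2(reference)
    moved_spectrum = np.fft.fft2(moved)

    # Normalised cross-power spectrum: keep the phase difference only.
    cross_power = moved_spectrum * np.conj(ref_spectrum)
    cross_power /= np.abs(cross_power) + 1e-12

    # The inverse transform of a pure phase ramp is a peak at the displacement.
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)

    # Peak indices past the midpoint wrap around to negative shifts.
    return tuple(
        int(p) if p <= n // 2 else int(p) - n
        for p, n in zip(peak, correlation.shape)
    )


# Synthetic example: shift a random image by 5 rows and -9 columns.
rng = np.random.default_rng(0)
image = rng.random((64, 64))
shifted = np.roll(image, shift=(5, -9), axis=(0, 1))

estimated_shift = phase_correlation_shift(image, shifted)

# The magnitude spectrum is unchanged by the (circular) translation.
magnitude_gap = np.max(
    np.abs(np.abs(np.fft.fft2(image)) - np.abs(np.fft.fft2(shifted)))
)

print("estimated shift:", estimated_shift)
print("max magnitude difference:", magnitude_gap)
```

No correspondences and no initial guess are needed: the whole search over translations is done in one pass through the spectrum. A differentiable variant would have to replace the hard `argmax` with a smooth peak estimate; the abstract does not say how DPCN++ does this.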
Source journal
CiteScore: 28.40
Self-citation rate: 3.00%
Articles published: 885
Review time: 8.5 months
Journal description: The IEEE Transactions on Pattern Analysis and Machine Intelligence publishes articles on all traditional areas of computer vision and image understanding, all traditional areas of pattern analysis and recognition, and selected areas of machine intelligence, with a particular emphasis on machine learning for pattern analysis. It also covers techniques for visual search, document and handwriting analysis, medical image analysis, video and image sequence analysis, content-based retrieval of image and video, face and gesture recognition, and relevant specialized hardware and/or software architectures.