SE-UF-PVNet: A Structure Enhanced Pixel-wise Union vector Fields Voting Network for 6DoF Pose Estimation

Yuqing Huang, Kefeng Wu, Fujun Sun, ChaoQuan Cai
{"title":"SE-UF-PVNet: A Structure Enhanced Pixel-wise Union vector Fields Voting Network for 6DoF Pose Estimation","authors":"Yuqing Huang, Kefeng Wu, Fujun Sun, ChaoQuan Cai","doi":"10.1145/3603781.3603859","DOIUrl":null,"url":null,"abstract":"This paper focuses on addressing the problem of 6DoF object pose estimation with a known 3D model from a single RGB image. Some recent works have shown that structure information is effective for 6DoF pose estimation but they do not make full use. We propose SE-UF-PVNet, a more explicit, flexible, and powerful framework to introduce structure information. We construct a keypoint graph in the object coordinate system, introduce a Graph Convolution Network module to extract structure features from the keypoint graph, and concatenate them with features extracted from RGB images by the keypoints regressing network at pixel-wise. To make the estimation more robust, we predict direction vector fields and distance vector fields concurrently, propose a modified pixel-wise voting based keypoint localization algorithm on distance vector fields and further propose an algorithm based on union vector fields. Additionally, we add an Atrous Spatial Pyramid Pooling module to enhance the multi-scale feature sensing capability. Experiment results show that our method achieves 91.88 average ADD (-S) accuracy on Linemod dataset, which is the best among existing pixel-wise voting based methods. 
Similarly, our method achieves 49.01 average ADD (-S) accuracy on Occlusion Linemod dataset, which is the state-of-the-art among all compared methods without pose refinement.","PeriodicalId":391180,"journal":{"name":"Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things","volume":"72 4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3603781.3603859","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

This paper addresses 6DoF object pose estimation from a single RGB image, given a known 3D model of the object. Recent works have shown that structure information is effective for 6DoF pose estimation, but they do not make full use of it. We propose SE-UF-PVNet, a more explicit, flexible, and powerful framework for introducing structure information. We construct a keypoint graph in the object coordinate system, introduce a Graph Convolution Network module to extract structure features from the keypoint graph, and concatenate them pixel-wise with features extracted from RGB images by the keypoint regression network. To make the estimation more robust, we predict direction vector fields and distance vector fields concurrently, propose a modified pixel-wise-voting-based keypoint localization algorithm on the distance vector fields, and further propose an algorithm based on union vector fields. Additionally, we add an Atrous Spatial Pyramid Pooling module to enhance the multi-scale feature sensing capability. Experimental results show that our method achieves 91.88 average ADD(-S) accuracy on the Linemod dataset, the best among existing pixel-wise-voting-based methods. Similarly, our method achieves 49.01 average ADD(-S) accuracy on the Occlusion Linemod dataset, which is state-of-the-art among all compared methods without pose refinement.
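To illustrate the union-vector-field idea described above (this is a minimal sketch of the general concept, not the authors' implementation, and all names are hypothetical): if each foreground pixel predicts a unit direction toward a keypoint together with a scalar distance, then every pixel yields its own 2D keypoint hypothesis, and a robust aggregate such as the coordinate-wise median can localize the keypoint.

```python
import numpy as np

def localize_keypoint(pixels, directions, distances):
    """Combine per-pixel direction and distance predictions into one keypoint.

    pixels:     (N, 2) array of pixel coordinates
    directions: (N, 2) array of unit vectors pointing toward the keypoint
    distances:  (N,)   array of predicted pixel-to-keypoint distances

    Each pixel p gives the hypothesis p + r * d; the coordinate-wise median
    is an outlier-robust aggregate of those hypotheses.
    """
    hypotheses = pixels + distances[:, None] * directions
    return np.median(hypotheses, axis=0)

# Synthetic sanity check with noise-free fields around a keypoint at (50, 40).
rng = np.random.default_rng(0)
kp = np.array([50.0, 40.0])
pixels = rng.uniform(0.0, 100.0, size=(200, 2))
vecs = kp - pixels
dists = np.linalg.norm(vecs, axis=1)
dirs = vecs / dists[:, None]
est = localize_keypoint(pixels, dirs, dists)
```

With exact fields every hypothesis coincides with the true keypoint; in practice the predicted fields are noisy, which is why the paper retains a voting scheme rather than a plain average.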