DFW-PVNet: data field weighting based pixel-wise voting network for effective 6D pose estimation

IF 3.5; Zone 2 (Computer Science); JCR Q2 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

Applied Intelligence Pub Date : 2024-12-30 DOI:10.1007/s10489-024-05942-9

Yinning Lu, Songwei Pei

Applied Intelligence, volume 55, issue 3. Journal Article. https://link.springer.com/article/10.1007/s10489-024-05942-9

Citations: 0

Abstract


Sparse 6 degrees-of-freedom (6D) pose estimation methods build sparse two-dimensional (2D) to three-dimensional (3D) correspondences to estimate the pose of objects in an RGB image, which reduces memory and computational overhead. However, these methods often suffer from degraded accuracy. In this paper, we propose a data field weighting based pixel-wise voting network (DFW-PVNet) that aims to improve the accuracy of 6D pose estimation while keeping memory and computational overheads low. DFW-PVNet first assigns potential weights to pixels at different positions using data field theory, then selects the pixels with higher potential weights to participate in voting for and locating the 2D keypoints. With accurate sparse 2D-3D correspondences between the located 2D keypoints and the corresponding predefined 3D keypoints, the 6D pose of the object is calculated by a perspective-n-point (PnP) solver. Experiments on the LINEMOD and Occlusion LINEMOD datasets show that the accuracy of the proposed method surpasses state-of-the-art sparse-based methods and is comparable to dense-based methods, with significantly lower memory and computational overheads.
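The abstract does not give the potential function or the voting procedure, so the following is only a minimal sketch of the weight-then-vote idea under stated assumptions. It assumes a Gaussian-like data-field potential with unit masses and a hypothetical influence factor `sigma`, and it swaps in a weighted least-squares line intersection for the network's actual voting scheme. The names `potential_weights`, `select_top`, `locate_keypoint`, and `keep_ratio` are illustrative and do not come from the paper.

```python
import math


def potential_weights(pixels, sigma=2.0):
    """Potential of each pixel in a data field generated by all pixels.

    Assumed form (not taken from the paper):
    phi(p) = sum_j exp(-(||p - q_j|| / sigma)^2), with unit masses.
    """
    weights = []
    for px, py in pixels:
        phi = 0.0
        for qx, qy in pixels:
            d2 = (px - qx) ** 2 + (py - qy) ** 2
            phi += math.exp(-d2 / sigma ** 2)
        weights.append(phi)
    return weights


def select_top(weights, keep_ratio=0.5):
    """Indices of the highest-potential pixels (at least one is kept)."""
    k = max(1, int(len(weights) * keep_ratio))
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return order[:k]


def locate_keypoint(pixels, directions, weights):
    """Weighted least-squares intersection of the lines pixel + s * direction.

    Assumes the directions are non-zero and not all parallel.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (px, py), (dx, dy), w in zip(pixels, directions, weights):
        norm = math.hypot(dx, dy)
        dx, dy = dx / norm, dy / norm
        # Entries of I - d d^T, which measures distance from a point to the line.
        m11, m12, m22 = 1.0 - dx * dx, -dx * dy, 1.0 - dy * dy
        a11 += w * m11
        a12 += w * m12
        a22 += w * m22
        b1 += w * (m11 * px + m12 * py)
        b2 += w * (m12 * px + m22 * py)
    det = a11 * a22 - a12 * a12
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


# Toy data: a 3x3 cluster of object pixels plus one isolated pixel at (20, 20).
pixels = [(float(x), float(y)) for x in range(3) for y in range(3)] + [(20.0, 20.0)]
keypoint = (10.0, 4.0)
# Cluster pixels point exactly at the keypoint; the isolated pixel points the wrong way.
directions = [(keypoint[0] - x, keypoint[1] - y) for x, y in pixels]
directions[-1] = (1.0, 0.0)

weights = potential_weights(pixels)
kept = select_top(weights, keep_ratio=0.5)
estimate = locate_keypoint(
    [pixels[i] for i in kept],
    [directions[i] for i in kept],
    [weights[i] for i in kept],
)
print(kept, estimate)
```

In this toy setup the isolated pixel receives the lowest potential and is excluded from voting, so the estimate lands on the true keypoint. In a full pipeline, the located 2D keypoints and their predefined 3D counterparts would then be passed to a PnP solver such as OpenCV's `cv2.solvePnP`.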


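Source journal

Applied Intelligence (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 6.60

Self-citation rate: 20.80%

Articles published: 1361

Review time: 5.9 months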

Journal description: With a focus on research in artificial intelligence and neural networks, this journal addresses issues involving solutions of real-life manufacturing, defense, management, government and industrial problems which are too complex to be solved through conventional approaches and require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance. The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.