ReINView: Re-interpreting Views for Multi-view 3D Object Recognition

Ruchang Xu, Wei Ma, Qing Mi, H. Zha
DOI: 10.1109/IROS47612.2022.9981777
Published in: 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Publication date: 2022-10-23
Citations: 1

Abstract

Multi-view-based 3D object recognition is important in robot-environment interaction. However, recent methods simply extract features from each view via convolutional neural networks (CNNs) and then fuse these features to make predictions. These methods ignore the inherent ambiguities in each view caused by 3D-to-2D projection. To address this problem, we propose a novel deep framework for multi-view-based 3D object recognition. Instead of fusing the multi-view features directly, we design a re-interpretation module (ReINView) to eliminate the ambiguities in each view. To achieve this, ReINView re-interprets view features patch by patch using context from nearby views, since local patches are generally co-visible at nearby viewpoints. Because contour shapes are also essential for 3D object recognition, ReINView further performs view-level re-interpretation, using all views as context sources since the target contours to be re-interpreted are globally observable. The re-interpreted multi-view features better reflect the 3D global and local structure of the object. Experiments on both ModelNet40 and ModelNet10 show that the proposed model outperforms state-of-the-art methods in 3D object recognition.
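The paper itself provides no code in this abstract. As a rough illustration of the patch-level re-interpretation idea — each view's patch features aggregating context from corresponding patches in nearby views — here is a minimal NumPy sketch using cross-view attention. The function name, the neighbor-selection scheme, and the residual aggregation are all assumptions for illustration, not the authors' actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def reinterpret_patches(views, neighbors):
    """Re-interpret each view's patch features via attention over the
    corresponding patches of its neighboring views (hypothetical sketch).

    views:     (V, P, D) array -- V views, P patches per view, D-dim features
    neighbors: list of V lists, each holding the indices of nearby views
    """
    V, P, D = views.shape
    out = np.empty_like(views)
    for v in range(V):
        ctx = views[neighbors[v]]            # (K, P, D) context from K nearby views
        q = views[v]                         # (P, D) query patches of view v
        k = ctx.transpose(1, 0, 2)           # (P, K, D) same patch across neighbors
        # scaled dot-product attention of each patch over its K counterparts
        scores = np.einsum('pd,pkd->pk', q, k) / np.sqrt(D)
        w = softmax(scores, axis=-1)         # (P, K) attention weights
        agg = np.einsum('pk,pkd->pd', w, k)  # weighted context aggregation
        out[v] = q + agg                     # residual re-interpretation
    return out
```

The view-level re-interpretation described in the abstract would follow the same pattern with one feature vector per view and every other view as context. A ring of neighbors, e.g. `neighbors = [[3, 1], [0, 2], [1, 3], [2, 0]]` for four circularly arranged viewpoints, reflects the co-visibility assumption for nearby views.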