Rotation invariant dual-view 3D point cloud reconstruction with geometrical consistency based feature aggregation

Xin Jia, Jinglei Zhang, Lei Jia, Yunbo Wang, Shengyong Chen

Information Fusion, Volume 120, Article 103114 (2025). DOI: 10.1016/j.inffus.2025.103114
Abstract
Multi-view 3D reconstruction typically aggregates features of an object observed from different views to recover its 3D shape. We argue that exploiting the rotation invariance of object regions, and further learning the geometrical consistency of regions across views, enables better feature aggregation. However, existing methods do not investigate this insight. Moreover, the self-occlusion inherent in the input views compromises consistency learning. This paper presents an approach termed Rotation invariant dual-view 3D point cloud reconstruction with Geometrical consistency based Feature aggregation (R3GF), which reconstructs a 3D point cloud from two RGB images taken from arbitrary views. In encoding, a point cloud initialization network initializes a rough point cloud for each view. To exploit the rotation invariance of object regions, a regional feature extraction network is proposed; it uses Euclidean distance and angle-based clues to capture rotation-invariant features that characterize the geometry of different regions of the rough point clouds. In decoding, to perform consistency learning even when self-occlusion exists in the input views, a dual-stage cross-attention mechanism is devised. It enhances the captured regional features with the global shapes of the rough point clouds, enriching the information of occluded regions. The enhanced regional features from the rough point clouds of different views are then aligned to model the geometrical consistency among regions, achieving accurate feature aggregation. Finally, a point cloud refinement module produces a refined point cloud from the aggregated feature. Extensive experiments on the ShapeNet and Pix3D datasets show that R3GF outperforms state-of-the-art methods.
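The abstract does not give the paper's exact formulation, but the core idea behind distance- and angle-based clues is that both quantities are preserved under rigid rotation. The NumPy sketch below (the function name, anchor choice, and centroid-based angle reference are illustrative assumptions, not taken from the paper) shows minimal rotation-invariant descriptors of this kind for a point-cloud region, with a sanity check that a random rotation leaves them unchanged.

```python
import numpy as np

def rotation_invariant_region_features(points, anchor):
    """Illustrative rotation-invariant descriptors for a region of a point
    cloud: Euclidean distances from an anchor point to each point, and the
    cosine of the angle between each anchor-to-point vector and the
    anchor-to-centroid vector. Rigid rotations preserve both quantities.
    NOTE: this is a hypothetical sketch, not the descriptor used in R3GF."""
    centroid = points.mean(axis=0)
    to_points = points - anchor        # vectors from anchor to each point
    to_centroid = centroid - anchor    # vector from anchor to region centroid

    dists = np.linalg.norm(to_points, axis=1)        # invariant: norms
    denom = dists * np.linalg.norm(to_centroid) + 1e-8
    cos_angles = to_points @ to_centroid / denom     # invariant: dot products

    return np.stack([dists, cos_angles], axis=1)     # (N, 2) feature matrix

# Sanity check: features match before and after a random rotation.
rng = np.random.default_rng(0)
pts = rng.normal(size=(64, 3))
anchor = pts[0]

Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # random orthogonal matrix
feats = rotation_invariant_region_features(pts, anchor)
feats_rot = rotation_invariant_region_features(pts @ Q.T, anchor @ Q.T)
assert np.allclose(feats, feats_rot, atol=1e-6)
```

Because distances and angles depend only on inner products of difference vectors, any features built from them are invariant to how the rough point clouds happen to be oriented, which is what allows regions from two arbitrarily posed views to be compared and aligned.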
Journal introduction:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines that drive its progress. It is the leading outlet for research and development in this field, focusing on architectures, algorithms, and applications. Papers presenting fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.