{"title":"MCNet: A multi-level consistency network for 3D point cloud self-supervised learning","authors":"Hongshuo Liu , Jing Bai , Gan Lin","doi":"10.1016/j.inffus.2025.103410","DOIUrl":null,"url":null,"abstract":"<div><div>With the rapid advancement of computer vision and artificial intelligence, point clouds have become a pivotal representation of 3D data. However, practical applications of point clouds are hindered by challenges such as noise, structural sparsity, and information occlusion, which complicate computations and degrade the performance of high-precision analyses. While self-supervised learning has proven effective in reducing reliance on annotated data, existing methods predominantly focus on local features, often neglecting global structures and the balance between geometric and semantic information. This paper introduces the Multi-level Consistency Network (MCNet), a novel framework designed to comprehensively explore multi-level feature information in point clouds. MCNet integrates geometric, structural, and high-order semantic supervisory signals, fostering alignment and complementarity of features through self-supervised learning. We propose the Global–Local Synergistic Noise Module (GLSNM), which combines Principal Component Analysis-based Non-Local Noise Addition (PCA-NLNA) with Mask-based Local Noise Injection (Mask-LNI) to balance the preservation of global structures and local details. Additionally, we develop the Multi-level Information Reconstruction Module (MIRM), which employs an attention fusion mechanism to dynamically balance geometric and high-order semantic information, thereby enhancing the model’s feature extraction capabilities in complex environments. Extensive experiments demonstrate that MCNet consistently outperforms existing methods across multiple tasks, including meta-classification, few-shot classification, real-world scene classification, fine-grained classification, and segmentation. These results validate the effectiveness of MCNet and its significant contribution to the field of point cloud processing.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"124 ","pages":"Article 103410"},"PeriodicalIF":14.7000,"publicationDate":"2025-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Fusion","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S156625352500483X","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
With the rapid advancement of computer vision and artificial intelligence, point clouds have become a pivotal representation of 3D data. However, practical applications of point clouds are hindered by challenges such as noise, structural sparsity, and information occlusion, which complicate computations and degrade the performance of high-precision analyses. While self-supervised learning has proven effective in reducing reliance on annotated data, existing methods predominantly focus on local features, often neglecting global structures and the balance between geometric and semantic information. This paper introduces the Multi-level Consistency Network (MCNet), a novel framework designed to comprehensively explore multi-level feature information in point clouds. MCNet integrates geometric, structural, and high-order semantic supervisory signals, fostering alignment and complementarity of features through self-supervised learning. We propose the Global–Local Synergistic Noise Module (GLSNM), which combines Principal Component Analysis-based Non-Local Noise Addition (PCA-NLNA) with Mask-based Local Noise Injection (Mask-LNI) to balance the preservation of global structures and local details. Additionally, we develop the Multi-level Information Reconstruction Module (MIRM), which employs an attention fusion mechanism to dynamically balance geometric and high-order semantic information, thereby enhancing the model’s feature extraction capabilities in complex environments. Extensive experiments demonstrate that MCNet consistently outperforms existing methods across multiple tasks, including meta-classification, few-shot classification, real-world scene classification, fine-grained classification, and segmentation. These results validate the effectiveness of MCNet and its significant contribution to the field of point cloud processing.
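To make the noise-corruption idea behind GLSNM concrete, the sketch below shows one plausible reading of its two components: global noise scaled along the cloud's principal axes (PCA-NLNA) and noise injected only inside a masked local patch (Mask-LNI). The paper's code is not reproduced here, so the function names, neighbourhood size, and noise scales are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch only: pca_nonlocal_noise and masked_local_noise are hypothetical
# names; all hyperparameters (scale, k) are assumptions chosen for illustration.
import numpy as np

def pca_nonlocal_noise(points: np.ndarray, scale: float = 0.02) -> np.ndarray:
    """Perturb the whole cloud along its principal axes (one reading of PCA-NLNA)."""
    centered = points - points.mean(axis=0, keepdims=True)
    # Principal directions of the cloud (rows of vt).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    # Gaussian noise expressed in the PCA basis, mapped back to xyz coordinates.
    coeffs = np.random.randn(points.shape[0], 3) * scale
    return points + coeffs @ vt

def masked_local_noise(points: np.ndarray, k: int = 64,
                       scale: float = 0.05) -> np.ndarray:
    """Add noise only to a masked local patch (one reading of Mask-LNI)."""
    noisy = points.copy()
    seed = points[np.random.randint(points.shape[0])]
    # The k nearest neighbours of a random seed point form the masked patch.
    dists = np.linalg.norm(points - seed, axis=1)
    idx = np.argsort(dists)[:k]
    noisy[idx] += np.random.randn(k, 3) * scale
    return noisy

if __name__ == "__main__":
    cloud = np.random.rand(1024, 3).astype(np.float32)   # toy point cloud
    corrupted = masked_local_noise(pca_nonlocal_noise(cloud))
    print(corrupted.shape)  # (1024, 3)
```

In this reading, the global term nudges the overall shape while preserving its principal structure, and the local term degrades a small region, so a reconstruction objective must recover both global geometry and local detail.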
Journal Introduction
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.