Fei Wang, Jia Wu, Rui Ma, Yisha Liu, Zengshuai Qiu
Title: A cross-modal Siamese representation learning network for point cloud understanding
Journal: Computers & Electrical Engineering, vol. 126, Article 110426
Publication date: 2025-05-26
DOI: 10.1016/j.compeleceng.2025.110426
URL: https://www.sciencedirect.com/science/article/pii/S0045790625003696
Citations: 0
Abstract
Learning effective representations from unannotated point cloud data is a challenging task in self-supervised learning. Recently, methods that use point clouds and images for cross-modal learning have achieved impressive performance. However, these methods do not fully exploit the latent information shared between the two modalities. To address this issue, we propose CrossSiamese, a cross-modal Siamese representation learning network that performs cross-modal contrastive learning on point clouds and their rendered images. An intra-modal prediction mechanism captures the information internal to the point cloud and image modalities, and a cross-modal cross-prediction mechanism captures the mutual information between the two. Experimental results show that our method improves linear classification accuracy for 3D objects by 0.4% on ModelNet40 and 1.7% on ScanObjectNN compared to existing baseline methods. Experiments on few-shot object classification and 3D object part segmentation further validate its effectiveness. These results indicate that the learned representations generalize well and transfer effectively to all three downstream tasks: linear classification, few-shot classification, and part segmentation.
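The abstract does not give the loss formulation, so the following is only an illustrative sketch of how an intra-modal prediction term and a cross-modal cross-prediction term might be combined in a Siamese setup. It assumes a SimSiam-style negative cosine similarity between a predictor output and a fixed target projection, two augmented views per modality, and an equal default weighting; none of these choices are confirmed by the source. The names `neg_cosine`, `symmetric_prediction_loss`, `combined_loss`, and `cross_weight` are hypothetical.

```python
import math
from typing import Sequence, Tuple

Vector = Sequence[float]
# Hypothetical container for this sketch: (predictor output, projection) of one view.
View = Tuple[Vector, Vector]


def neg_cosine(prediction: Vector, target: Vector) -> float:
    """Negative cosine similarity between a prediction and a target.

    In a real framework the target would be wrapped in a stop-gradient;
    that is an assumption borrowed from Siamese methods, not from the paper.
    """
    dot = sum(p * t for p, t in zip(prediction, target))
    norm_p = math.sqrt(sum(p * p for p in prediction))
    norm_t = math.sqrt(sum(t * t for t in target))
    return -dot / (norm_p * norm_t + 1e-12)


def symmetric_prediction_loss(a: View, b: View) -> float:
    """Each view's prediction is matched to the other view's projection."""
    (p_a, z_a), (p_b, z_b) = a, b
    return 0.5 * (neg_cosine(p_a, z_b) + neg_cosine(p_b, z_a))


def combined_loss(pc_1: View, pc_2: View, img_1: View, img_2: View,
                  cross_weight: float = 1.0) -> float:
    """Intra-modal terms (within point clouds, within images) plus one
    cross-modal term (point cloud vs. rendered image). The weighting is assumed."""
    intra = (symmetric_prediction_loss(pc_1, pc_2)
             + symmetric_prediction_loss(img_1, img_2))
    cross = symmetric_prediction_loss(pc_1, img_1)
    return intra + cross_weight * cross
```

Under these assumptions, perfectly aligned predictions and projections drive each of the three terms toward -1, so the combined value approaches -3 when `cross_weight` is 1.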
About the journal:
The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation that has been motivated by the natural ease of interface between computers and electrical systems, and the promise of spectacular improvements in speed and efficiency.
Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas like signal and image processing, high-performance computing, parallel processing, and communications. Special attention will be paid to papers describing innovative architectures, algorithms, and software tools.