{"title":"CDAF:点云补全的跨模态和双通道上样自适应融合网络","authors":"Ming Lu , Jian Li , Duo Han Zhao, Qin Wang","doi":"10.1016/j.imavis.2025.105735","DOIUrl":null,"url":null,"abstract":"<div><div>In real-world scenarios, point cloud data often suffers from incompleteness due to limitations in sensor viewpoints, resolution constraints, and self-occlusions, which hinders its applications in domains such as autonomous driving and robotics. To address these challenges, this paper proposes a novel Cross-Modal and Dual-channel Upsample Adaptive Fusion network (CDAF), Our framework innovatively integrates depth maps with point clouds through dual-channel attention and gating units, significantly improving completion accuracy and detail recovery. The framework comprises two core modules: Cross-Modal Feature Enhancement (CMFE) and Dual-channel Upsampling Adaptive Fusion (DUAF). CMFE enhances point cloud feature representation by leveraging Spatial-activated Channel Attention to model channel-wise dependencies and Max-Sigmoid Attention to align cross-modal features between depth maps and point clouds, DUAF progressively refines coarse point clouds through a parallel structural analysis and similarity alignment branches, enabling adaptive fusion of local geometric priors and global shape consistency. Experimental results on multiple benchmark datasets demonstrate that CDAF surpasses existing state-of-the-art methods in point cloud completion tasks, showcasing superior global shape understanding and detail recovery.</div></div>","PeriodicalId":50374,"journal":{"name":"Image and Vision Computing","volume":"163 ","pages":"Article 105735"},"PeriodicalIF":4.2000,"publicationDate":"2025-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"CDAF: Cross-Modal and Dual-channel Upsample Adaptive Fusion network for Point Cloud Completion\",\"authors\":\"Ming Lu , Jian Li , Duo Han Zhao, Qin Wang\",\"doi\":\"10.1016/j.imavis.2025.105735\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>In real-world scenarios, point cloud data often suffers from incompleteness due to limitations in sensor viewpoints, resolution constraints, and self-occlusions, which hinders its applications in domains such as autonomous driving and robotics. To address these challenges, this paper proposes a novel Cross-Modal and Dual-channel Upsample Adaptive Fusion network (CDAF), Our framework innovatively integrates depth maps with point clouds through dual-channel attention and gating units, significantly improving completion accuracy and detail recovery. The framework comprises two core modules: Cross-Modal Feature Enhancement (CMFE) and Dual-channel Upsampling Adaptive Fusion (DUAF). CMFE enhances point cloud feature representation by leveraging Spatial-activated Channel Attention to model channel-wise dependencies and Max-Sigmoid Attention to align cross-modal features between depth maps and point clouds, DUAF progressively refines coarse point clouds through a parallel structural analysis and similarity alignment branches, enabling adaptive fusion of local geometric priors and global shape consistency. 
Experimental results on multiple benchmark datasets demonstrate that CDAF surpasses existing state-of-the-art methods in point cloud completion tasks, showcasing superior global shape understanding and detail recovery.</div></div>\",\"PeriodicalId\":50374,\"journal\":{\"name\":\"Image and Vision Computing\",\"volume\":\"163 \",\"pages\":\"Article 105735\"},\"PeriodicalIF\":4.2000,\"publicationDate\":\"2025-10-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Image and Vision Computing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0262885625003233\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Image and Vision Computing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0262885625003233","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
CDAF: Cross-Modal and Dual-channel Upsample Adaptive Fusion network for Point Cloud Completion
In real-world scenarios, point cloud data often suffers from incompleteness due to limited sensor viewpoints, resolution constraints, and self-occlusion, which hinders its application in domains such as autonomous driving and robotics. To address these challenges, this paper proposes a novel Cross-Modal and Dual-channel Upsample Adaptive Fusion network (CDAF). The framework integrates depth maps with point clouds through dual-channel attention and gating units, significantly improving completion accuracy and detail recovery. It comprises two core modules: Cross-Modal Feature Enhancement (CMFE) and Dual-channel Upsampling Adaptive Fusion (DUAF). CMFE enhances point cloud feature representation by using Spatial-activated Channel Attention to model channel-wise dependencies and Max-Sigmoid Attention to align cross-modal features between depth maps and point clouds. DUAF progressively refines coarse point clouds through parallel structural-analysis and similarity-alignment branches, enabling adaptive fusion of local geometric priors and global shape consistency. Experimental results on multiple benchmark datasets demonstrate that CDAF surpasses existing state-of-the-art methods in point cloud completion, showing superior global shape understanding and detail recovery.
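The Max-Sigmoid Attention step in CMFE can be pictured as gating each point feature by its strongest response across the depth-map features: compute similarities between every point and every depth location, take the maximum per point, and squash it through a sigmoid into a per-point weight. The sketch below is a minimal illustration of this general mechanism in PyTorch, not the paper's actual implementation; the module name, projection layers (pc_proj, depth_proj), and embedding size are all hypothetical assumptions.

import torch
import torch.nn as nn

class MaxSigmoidCrossModalAttention(nn.Module):
    """Gates per-point features by their best match against depth-map features.

    Illustrative sketch only: the paper's exact CMFE design is not specified here.
    """
    def __init__(self, pc_dim: int, depth_dim: int, embed_dim: int = 128):
        super().__init__()
        self.pc_proj = nn.Linear(pc_dim, embed_dim)        # project point features
        self.depth_proj = nn.Linear(depth_dim, embed_dim)  # project depth-map features
        self.scale = embed_dim ** -0.5                     # dot-product scaling

    def forward(self, pc_feat: torch.Tensor, depth_feat: torch.Tensor) -> torch.Tensor:
        # pc_feat:    (B, N, pc_dim)    per-point features
        # depth_feat: (B, M, depth_dim) depth-map features (e.g. flattened patches)
        q = self.pc_proj(pc_feat)                            # (B, N, E)
        k = self.depth_proj(depth_feat)                      # (B, M, E)
        sim = torch.bmm(q, k.transpose(1, 2)) * self.scale   # (B, N, M) similarities
        # Max over depth locations, squashed by a sigmoid: one gate per point.
        gate = torch.sigmoid(sim.max(dim=-1).values)         # (B, N)
        return pc_feat * gate.unsqueeze(-1)                  # reweighted point features

# Example with hypothetical sizes: 1024 points and 196 depth patches, 256-dim each.
attn = MaxSigmoidCrossModalAttention(pc_dim=256, depth_dim=256)
out = attn(torch.randn(2, 1024, 256), torch.randn(2, 196, 256))  # (2, 1024, 256)

Taking a max rather than a softmax over depth locations means each point is gated only by its single best cross-modal match, which keeps the alignment signal sharp for sparse depth evidence; this reading follows the generic Max-Sigmoid formulation, not a detail confirmed by the abstract.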
Journal Introduction:
Image and Vision Computing has as a primary aim the provision of an effective medium of interchange for the results of high quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real world scenes. It seeks to foster a deeper understanding in the discipline by encouraging the quantitative comparison and performance evaluation of the proposed methodology. The coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically-inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, image databases.