Multimodality Consistency for Point Cloud Completion via Differentiable Rendering

Ben Fei; Yixuan Li; Weidong Yang; Wen-Ming Chen; Zhijun Li

IEEE Transactions on Artificial Intelligence, vol. 6, no. 7, pp. 1746-1760. Published 2025-01-10. DOI: 10.1109/TAI.2025.3527922. https://ieeexplore.ieee.org/document/10836747/

Citations: 0
Abstract
Point cloud completion aims to recover complete, high-fidelity point clouds from partial, low-quality scans, a capability widely used in remote sensing applications. Existing methods tend to solve this problem solely within the point cloud modality, restricting the completion process to 3-D structure alone and overlooking information from other modalities. Yet additional modalities carry valuable cues that can greatly enhance completion: for example, the edge information in depth images can serve as a supervisory signal that enforces accurate outlines and overall shape. To this end, we propose a novel point cloud completion network, dubbed multimodality differentiable rendering (MMDR), which uses point-based differentiable rendering (DR) to obtain depth images, ensuring that the model preserves point cloud structure in the depth-image domain. Moreover, an attentional feature extractor (AFE) module is devised to exploit the global features inherent in the partial input; the extracted global features, together with the coordinates and features of each patch center, are fed into the point roots predictor (PRP) module to obtain a set of point roots for the upsampling module built on the point upsampling Transformer (PU-Transformer). Furthermore, a multimodality consistency loss between the depth images rendered from the predicted point clouds and those from the corresponding ground truth enables the PU-Transformer to generate high-fidelity point clouds from the predicted point agents. Extensive qualitative and quantitative experiments on existing datasets show that MMDR surpasses off-the-shelf point cloud completion methods.
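To make the multimodality consistency idea concrete, the following is a minimal NumPy sketch of the supervision signal the abstract describes: project a point cloud onto a depth image, then take the per-pixel L1 distance between the renderings of the predicted and ground-truth clouds. Note this uses hard z-buffering as a stand-in; the paper's actual renderer is point-based and differentiable, and the function names, image size, and coordinate range here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def render_depth(points, size=32, span=1.0):
    """Project an (N, 3) point cloud onto a size x size depth image by
    z-buffering along the z-axis. Hard rasterization stand-in for the
    paper's point-based differentiable renderer (assumed interface)."""
    depth = np.full((size, size), np.inf)
    # Map x, y from [-span, span] to integer pixel indices.
    ij = np.clip(((points[:, :2] + span) / (2 * span) * size).astype(int),
                 0, size - 1)
    for (i, j), z in zip(ij, points[:, 2]):
        depth[j, i] = min(depth[j, i], z)  # keep the nearest depth per pixel
    depth[np.isinf(depth)] = 0.0  # background depth for empty pixels
    return depth

def depth_consistency_loss(pred_pts, gt_pts, size=32):
    """Mean L1 distance between the depth renderings of the predicted
    and ground-truth clouds -- the multimodality consistency signal."""
    return np.abs(render_depth(pred_pts, size) - render_depth(gt_pts, size)).mean()
```

Identical clouds render to identical depth maps (loss 0), while any geometric deviation in the prediction shows up as a nonzero per-pixel residual, which is the outline/shape supervision the depth modality provides.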