Zhenjiang Du;Zhitao Liu;Guan Wang;Jiwei Wei;Sophyani Banaamwini Yussif;Zheng Wang;Ning Xie;Yang Yang
{"title":"CMNet:基于patch的点云补全的跨模态粗到精网络","authors":"Zhenjiang Du;Zhitao Liu;Guan Wang;Jiwei Wei;Sophyani Banaamwini Yussif;Zheng Wang;Ning Xie;Yang Yang","doi":"10.1109/TCSVT.2025.3557842","DOIUrl":null,"url":null,"abstract":"Point clouds serve as the foundational representation of 3D objects, playing a pivotal role in both computer vision and computer graphics. Recently, the acquisition of point clouds has been effortless because of the development of hardware devices. However, the collected point clouds may be incomplete due to environmental conditions, such as occlusion. Therefore, completing partial point clouds becomes an essential task. The majority of current methods address point cloud completion via the utilization of shape priors. While these methods have demonstrated commendable performance, they often encounter challenges in preserving the global structural and geometric details of the 3D shape. In contrast to those mentioned earlier, we propose a novel cross-modal coarse-to-fine network (CMNet) for point cloud completion. Our method utilizes additional image information to provide global information, thus avoiding the loss of structure. To ensure that the generated results contain sufficient geometric details, we propose a coarse-to-fine learning approach based on multiple patches. Specifically, we encode the image and use multiple generators to generate multiple coarse patches, which are combined into a complete shape. Subsequently, based on the coarse patches generated in advance, we generate fine patches by combining partial point cloud information. Experimental results show that our method achieves state-of-the-art performance on point cloud completion.","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"35 9","pages":"9132-9147"},"PeriodicalIF":11.1000,"publicationDate":"2025-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"CMNet: Cross-Modal Coarse-to-Fine Network for Point Cloud Completion Based on Patches\",\"authors\":\"Zhenjiang Du;Zhitao Liu;Guan Wang;Jiwei Wei;Sophyani Banaamwini Yussif;Zheng Wang;Ning Xie;Yang Yang\",\"doi\":\"10.1109/TCSVT.2025.3557842\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Point clouds serve as the foundational representation of 3D objects, playing a pivotal role in both computer vision and computer graphics. Recently, the acquisition of point clouds has been effortless because of the development of hardware devices. However, the collected point clouds may be incomplete due to environmental conditions, such as occlusion. Therefore, completing partial point clouds becomes an essential task. The majority of current methods address point cloud completion via the utilization of shape priors. While these methods have demonstrated commendable performance, they often encounter challenges in preserving the global structural and geometric details of the 3D shape. In contrast to those mentioned earlier, we propose a novel cross-modal coarse-to-fine network (CMNet) for point cloud completion. Our method utilizes additional image information to provide global information, thus avoiding the loss of structure. To ensure that the generated results contain sufficient geometric details, we propose a coarse-to-fine learning approach based on multiple patches. 
Specifically, we encode the image and use multiple generators to generate multiple coarse patches, which are combined into a complete shape. Subsequently, based on the coarse patches generated in advance, we generate fine patches by combining partial point cloud information. Experimental results show that our method achieves state-of-the-art performance on point cloud completion.\",\"PeriodicalId\":13082,\"journal\":{\"name\":\"IEEE Transactions on Circuits and Systems for Video Technology\",\"volume\":\"35 9\",\"pages\":\"9132-9147\"},\"PeriodicalIF\":11.1000,\"publicationDate\":\"2025-04-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Circuits and Systems for Video Technology\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10949193/\",\"RegionNum\":1,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Circuits and Systems for Video Technology","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10949193/","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
CMNet: Cross-Modal Coarse-to-Fine Network for Point Cloud Completion Based on Patches
Point clouds serve as the foundational representation of 3D objects, playing a pivotal role in both computer vision and computer graphics. Recently, acquiring point clouds has become straightforward thanks to advances in hardware devices. However, the collected point clouds may be incomplete due to environmental conditions such as occlusion, so completing partial point clouds becomes an essential task. Most current methods address point cloud completion by exploiting shape priors. While these methods have demonstrated commendable performance, they often struggle to preserve both the global structure and the geometric details of the 3D shape. In contrast to these approaches, we propose a novel cross-modal coarse-to-fine network (CMNet) for point cloud completion. Our method uses additional image information to provide global cues, thus avoiding the loss of structure. To ensure that the generated results contain sufficient geometric detail, we propose a coarse-to-fine learning approach based on multiple patches. Specifically, we encode the image and use multiple generators to produce multiple coarse patches, which are combined into a complete shape. Subsequently, based on these coarse patches, we generate fine patches by incorporating information from the partial point cloud. Experimental results show that our method achieves state-of-the-art performance on point cloud completion.
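The pipeline described in the abstract (image encoding, several coarse-patch generators whose outputs are merged into a coarse shape, and a refinement stage conditioned on the partial input cloud) can be illustrated with a minimal PyTorch sketch. All module names, layer sizes, the number of patches, and points per patch below are illustrative assumptions and not the authors' implementation of CMNet.

# Hypothetical sketch of a cross-modal coarse-to-fine, patch-based completion
# pipeline; module names and dimensions are assumptions for illustration only.
import torch
import torch.nn as nn


class ImageEncoder(nn.Module):
    """Encodes a single-view image into a global feature vector."""
    def __init__(self, feat_dim: int = 256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, feat_dim)

    def forward(self, img):                          # img: (B, 3, H, W)
        return self.fc(self.conv(img).flatten(1))    # (B, feat_dim)


class CoarsePatchGenerator(nn.Module):
    """One of several generators; each decodes the image feature into one coarse patch."""
    def __init__(self, feat_dim: int = 256, points_per_patch: int = 128):
        super().__init__()
        self.points_per_patch = points_per_patch
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, 512), nn.ReLU(),
            nn.Linear(512, points_per_patch * 3),
        )

    def forward(self, feat):                          # feat: (B, feat_dim)
        return self.mlp(feat).view(-1, self.points_per_patch, 3)


class FinePatchRefiner(nn.Module):
    """Refines coarse patches with a feature pooled from the partial input cloud."""
    def __init__(self, feat_dim: int = 256):
        super().__init__()
        self.partial_encoder = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, feat_dim),
        )
        self.offset_mlp = nn.Sequential(
            nn.Linear(3 + feat_dim, 256), nn.ReLU(), nn.Linear(256, 3),
        )

    def forward(self, coarse, partial):               # coarse: (B, N, 3), partial: (B, M, 3)
        partial_feat = self.partial_encoder(partial).max(dim=1).values   # (B, feat_dim)
        cond = partial_feat.unsqueeze(1).expand(-1, coarse.size(1), -1)  # (B, N, feat_dim)
        return coarse + self.offset_mlp(torch.cat([coarse, cond], dim=-1))


class CMNetSketch(nn.Module):
    """Image -> multiple coarse patches -> merged coarse shape -> refined fine shape."""
    def __init__(self, num_patches: int = 8):
        super().__init__()
        self.image_encoder = ImageEncoder()
        self.generators = nn.ModuleList(
            [CoarsePatchGenerator() for _ in range(num_patches)]
        )
        self.refiner = FinePatchRefiner()

    def forward(self, img, partial):
        feat = self.image_encoder(img)
        # Each generator emits one coarse patch; concatenation forms the coarse shape.
        coarse = torch.cat([g(feat) for g in self.generators], dim=1)
        fine = self.refiner(coarse, partial)
        return coarse, fine


if __name__ == "__main__":
    model = CMNetSketch()
    img = torch.randn(2, 3, 128, 128)
    partial = torch.randn(2, 1024, 3)
    coarse, fine = model(img, partial)
    print(coarse.shape, fine.shape)   # both (2, 1024, 3): 8 patches x 128 points

In this sketch the image feature supplies the global cue for all patch generators, while the refinement stage injects geometry from the observed partial cloud, mirroring the coarse-to-fine, cross-modal idea at a very high level.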
Journal Introduction:
The IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) is dedicated to covering all aspects of video technologies from a circuits and systems perspective. We encourage submissions of general, theoretical, and application-oriented papers related to image and video acquisition, representation, presentation, and display. Additionally, we welcome contributions in areas such as processing, filtering, and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication, and networking; as well as storage, retrieval, indexing, and search. Furthermore, papers focusing on hardware and software design and implementation are highly valued. Join us in advancing the field of video technology through innovative research and insights.