利用密集深度的二维图像融合进行三维语义分割的对抗性无监督领域自适应

IF 2.7 4区计算机科学 Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Computer Graphics Forum Pub Date : 2024-10-24 DOI:10.1111/cgf.15250

Xindan Zhang, Ying Li, Huankun Sheng, Xinnian Zhang

{"title":"利用密集深度的二维图像融合进行三维语义分割的对抗性无监督领域自适应","authors":"Xindan Zhang, Ying Li, Huankun Sheng, Xinnian Zhang","doi":"10.1111/cgf.15250","DOIUrl":null,"url":null,"abstract":"<p>Unsupervised domain adaptation (UDA) is increasingly used for 3D point cloud semantic segmentation tasks due to its ability to address the issue of missing labels for new domains. However, most existing unsupervised domain adaptation methods focus only on uni-modal data and are rarely applied to multi-modal data. Therefore, we propose a cross-modal UDA on multi-modal datasets that contain 3D point clouds and 2D images for 3D Semantic Segmentation. Specifically, we first propose a Dual discriminator-based Domain Adaptation (Dd-bDA) module to enhance the adaptability of different domains. Second, given that the robustness of depth information to domain shifts can provide more details for semantic segmentation, we further employ a Dense depth Feature Fusion (DdFF) module to extract image features with rich depth cues. We evaluate our model in four unsupervised domain adaptation scenarios, i.e., dataset-to-dataset (A2D2 → SemanticKITTI), Day-to-Night, country-to-country (USA → Singapore), and synthetic-to-real (VirtualKITTI → SemanticKITTI). In all settings, the experimental results achieve significant improvements and surpass state-of-the-art models.</p>","PeriodicalId":10687,"journal":{"name":"Computer Graphics Forum","volume":"43 7","pages":""},"PeriodicalIF":2.7000,"publicationDate":"2024-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Adversarial Unsupervised Domain Adaptation for 3D Semantic Segmentation with 2D Image Fusion of Dense Depth\",\"authors\":\"Xindan Zhang, Ying Li, Huankun Sheng, Xinnian Zhang\",\"doi\":\"10.1111/cgf.15250\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Unsupervised domain adaptation (UDA) is increasingly used for 3D point cloud semantic segmentation tasks due to its ability to address the issue of missing labels for new domains. However, most existing unsupervised domain adaptation methods focus only on uni-modal data and are rarely applied to multi-modal data. Therefore, we propose a cross-modal UDA on multi-modal datasets that contain 3D point clouds and 2D images for 3D Semantic Segmentation. Specifically, we first propose a Dual discriminator-based Domain Adaptation (Dd-bDA) module to enhance the adaptability of different domains. Second, given that the robustness of depth information to domain shifts can provide more details for semantic segmentation, we further employ a Dense depth Feature Fusion (DdFF) module to extract image features with rich depth cues. We evaluate our model in four unsupervised domain adaptation scenarios, i.e., dataset-to-dataset (A2D2 → SemanticKITTI), Day-to-Night, country-to-country (USA → Singapore), and synthetic-to-real (VirtualKITTI → SemanticKITTI). In all settings, the experimental results achieve significant improvements and surpass state-of-the-art models.</p>\",\"PeriodicalId\":10687,\"journal\":{\"name\":\"Computer Graphics Forum\",\"volume\":\"43 7\",\"pages\":\"\"},\"PeriodicalIF\":2.7000,\"publicationDate\":\"2024-10-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computer Graphics Forum\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1111/cgf.15250\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, SOFTWARE ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer Graphics Forum","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1111/cgf.15250","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}

引用次数: 0

摘要

无监督领域适应（UDA）由于能够解决新领域标签缺失的问题，越来越多地被用于三维点云语义分割任务。然而，现有的大多数无监督域适应方法只关注单模态数据，很少应用于多模态数据。因此，我们在包含三维点云和二维图像的多模态数据集上提出了一种用于三维语义分割的跨模态 UDA。具体来说，我们首先提出了基于双判别器的领域适应（Dd-bDA）模块，以增强不同领域的适应性。其次，鉴于深度信息对域偏移的鲁棒性可以为语义分割提供更多细节，我们进一步采用了密集深度特征融合（DdFF）模块来提取具有丰富深度线索的图像特征。我们在四种无监督领域适应场景中评估了我们的模型，即数据集到数据集（A2D2 → SemanticKITTI）、白天到黑夜、国家到国家（美国 → 新加坡）以及合成到真实（VirtualKITTI → SemanticKITTI）。在所有设置中，实验结果都取得了显著的改进，并超越了最先进的模型。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Adversarial Unsupervised Domain Adaptation for 3D Semantic Segmentation with 2D Image Fusion of Dense Depth

Unsupervised domain adaptation (UDA) is increasingly used for 3D point cloud semantic segmentation tasks due to its ability to address the issue of missing labels for new domains. However, most existing unsupervised domain adaptation methods focus only on uni-modal data and are rarely applied to multi-modal data. Therefore, we propose a cross-modal UDA on multi-modal datasets that contain 3D point clouds and 2D images for 3D Semantic Segmentation. Specifically, we first propose a Dual discriminator-based Domain Adaptation (Dd-bDA) module to enhance the adaptability of different domains. Second, given that the robustness of depth information to domain shifts can provide more details for semantic segmentation, we further employ a Dense depth Feature Fusion (DdFF) module to extract image features with rich depth cues. We evaluate our model in four unsupervised domain adaptation scenarios, i.e., dataset-to-dataset (A2D2 → SemanticKITTI), Day-to-Night, country-to-country (USA → Singapore), and synthetic-to-real (VirtualKITTI → SemanticKITTI). In all settings, the experimental results achieve significant improvements and surpass state-of-the-art models.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Computer Graphics Forum 工程技术-计算机：软件工程

CiteScore

5.80

自引率

12.00%

发文量

175

审稿时长

3-6 weeks

期刊介绍： Computer Graphics Forum is the official journal of Eurographics, published in cooperation with Wiley-Blackwell, and is a unique, international source of information for computer graphics professionals interested in graphics developments worldwide. It is now one of the leading journals for researchers, developers and users of computer graphics in both commercial and academic environments. The journal reports on the latest developments in the field throughout the world and covers all aspects of the theory, practice and application of computer graphics.