Cross-Domain and Cross-Modal Knowledge Distillation in Domain Adaptation for 3D Semantic Segmentation

Miaoyu Li, Yachao Zhang, Yuan Xie, Z. Gao, Cuihua Li, Zhizhong Zhang, Yanyun Qu
{"title":"Cross-Domain and Cross-Modal Knowledge Distillation in Domain Adaptation for 3D Semantic Segmentation","authors":"Miaoyu Li, Yachao Zhang, Yuan Xie, Z. Gao, Cuihua Li, Zhizhong Zhang, Yanyun Qu","doi":"10.1145/3503161.3547990","DOIUrl":null,"url":null,"abstract":"With the emergence of multi-modal datasets where LiDAR and camera are synchronized and calibrated, cross-modal Unsupervised Domain Adaptation (UDA) has attracted increasing attention because it reduces the laborious annotation of target domain samples. To alleviate the distribution gap between source and target domains, existing methods conduct feature alignment by using adversarial learning. However, it is well-known to be highly sensitive to hyperparameters and difficult to train. In this paper, we propose a novel model (Dual-Cross) that integrates Cross-Domain Knowledge Distillation (CDKD) and Cross-Modal Knowledge Distillation (CMKD) to mitigate domain shift. Specifically, we design the multi-modal style transfer to convert source image and point cloud to target style. With these synthetic samples as input, we introduce a target-aware teacher network to learn knowledge of the target domain. Then we present dual-cross knowledge distillation when the student is learning on source domain. CDKD constrains teacher and student predictions under same modality to be consistent. It can transfer target-aware knowledge from the teacher to the student, making the student more adaptive to the target domain. CMKD generates hybrid-modal prediction from the teacher predictions and constrains it to be consistent with both 2D and 3D student predictions. It promotes the information interaction between two modalities to make them complement each other. From the evaluation results on various domain adaptation settings, Dual-Cross significantly outperforms both uni-modal and cross-modal state-of-the-art methods.","PeriodicalId":412792,"journal":{"name":"Proceedings of the 30th ACM International Conference on Multimedia","volume":"71 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 30th ACM International Conference on Multimedia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3503161.3547990","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 7

Abstract

With the emergence of multi-modal datasets in which LiDAR and camera are synchronized and calibrated, cross-modal Unsupervised Domain Adaptation (UDA) has attracted increasing attention because it reduces the laborious annotation of target-domain samples. To alleviate the distribution gap between source and target domains, existing methods perform feature alignment with adversarial learning, which is known to be highly sensitive to hyperparameters and difficult to train. In this paper, we propose a novel model (Dual-Cross) that integrates Cross-Domain Knowledge Distillation (CDKD) and Cross-Modal Knowledge Distillation (CMKD) to mitigate domain shift. Specifically, we design a multi-modal style transfer that converts the source image and point cloud to the target style. With these synthetic samples as input, we introduce a target-aware teacher network that learns knowledge of the target domain. We then apply dual-cross knowledge distillation while the student learns on the source domain. CDKD constrains teacher and student predictions of the same modality to be consistent; it transfers target-aware knowledge from the teacher to the student, making the student more adaptive to the target domain. CMKD generates a hybrid-modal prediction from the teacher predictions and constrains it to be consistent with both the 2D and 3D student predictions, promoting information interaction between the two modalities so that they complement each other. Evaluations across various domain adaptation settings show that Dual-Cross significantly outperforms both uni-modal and cross-modal state-of-the-art methods.
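The dual-cross distillation described in the abstract boils down to two consistency terms. Below is a minimal PyTorch-style sketch of how the CDKD and CMKD losses could be computed from per-point class logits; the KL-divergence consistency measure, the simple averaging used to form the hybrid-modal teacher prediction, and the function names (`kl_consistency`, `dual_cross_losses`) are illustrative assumptions rather than the paper's exact formulation.

```python
# Minimal sketch of the CDKD and CMKD consistency losses (assumptions noted above).
import torch
import torch.nn.functional as F


def kl_consistency(student_logits, teacher_logits, temperature=1.0):
    """KL divergence between softened teacher and student class distributions."""
    t = temperature
    p_teacher = F.softmax(teacher_logits.detach() / t, dim=-1)
    log_p_student = F.log_softmax(student_logits / t, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (t * t)


def dual_cross_losses(student_2d, student_3d, teacher_2d, teacher_3d):
    """Compute CDKD and CMKD terms from per-point class logits of shape (N, num_classes)."""
    # CDKD: same-modality consistency between the target-aware teacher and the student.
    loss_cdkd = kl_consistency(student_2d, teacher_2d) + kl_consistency(student_3d, teacher_3d)

    # CMKD: a hybrid-modal teacher prediction supervises both student modalities.
    # Here the hybrid is a simple average of the teacher's 2D and 3D softmax outputs (assumption).
    hybrid = 0.5 * (F.softmax(teacher_2d.detach(), dim=-1) + F.softmax(teacher_3d.detach(), dim=-1))
    hybrid_logits = torch.log(hybrid + 1e-8)  # treat the mixture as a soft target
    loss_cmkd = kl_consistency(student_2d, hybrid_logits) + kl_consistency(student_3d, hybrid_logits)

    return loss_cdkd, loss_cmkd


if __name__ == "__main__":
    n_points, n_classes = 1024, 10
    s2d, s3d = torch.randn(n_points, n_classes), torch.randn(n_points, n_classes)
    t2d, t3d = torch.randn(n_points, n_classes), torch.randn(n_points, n_classes)
    cdkd, cmkd = dual_cross_losses(s2d, s3d, t2d, t3d)
    print(f"CDKD: {cdkd.item():.4f}, CMKD: {cmkd.item():.4f}")
```

In practice these two terms would be weighted and added to the supervised segmentation loss on the source domain; the weighting scheme is not specified in the abstract.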