CACP: Covariance-Aware Cross-Domain Prototypes for Domain Adaptive Semantic Segmentation

IF 9.7 · CAS Zone 1 (Computer Science) · JCR Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

IEEE Transactions on Multimedia · Pub Date: 2025-02-21 · DOI: 10.1109/TMM.2025.3543016

Yanbing Xue; Xinyu Tian; Feifei Zhang; Xianbin Wen; Zan Gao; Shengyong Chen

Volume 27, pages 5023-5034 · Journal Article · https://ieeexplore.ieee.org/document/10897886/

Citations: 0

Abstract

Domain adaptive semantic segmentation aims to reduce the domain shift between source and target domains, improving how well a model trained on the source domain generalizes to the target domain. Recently, prototypical methods, which primarily use single-source or single-target domain prototypes as category centers to aggregate features from both domains, have achieved competitive performance on this task. However, under large domain shifts, single-source domain prototypes have limited generalization ability, and not all source domain knowledge is conducive to model generalization. Single-target domain prototypes are noisy because they are prematurely initialized with all features filtered by pseudo labels, which causes errors to accumulate in the prototypes. To address these issues, we propose a covariance-aware cross-domain prototypes method (CACP) to achieve robust domain adaptation. We use prototypes from both domains to dynamically rectify pseudo labels in the target domain, effectively reducing the recognition difficulty of hard target domain samples and narrowing the gap between features of the same category in both domains. In addition, to further generalize the model to the target domain, we propose two modules based on covariance correlation, FSPC (Features Selection by Prototypes Covariances) and WSPC (Weighting Source by Prototypes Coefficients), to learn discriminative characteristics. FSPC selects highly correlated features to update target domain prototypes online, denoising the prototypes and enhancing discriminativeness between categories. WSPC uses the correlation coefficients between target domain prototypes and source domain features to weight each point in the source domain, eliminating interference from source domain information. In particular, CACP achieves excellent performance on the GTA5 $\to$ Cityscapes and SYNTHIA $\to$ Cityscapes tasks with minimal computational resources and time.
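The two correlation-based modules described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything specific here is an assumption made for illustration: the use of the Pearson correlation coefficient, the selection threshold of 0.5, the moving-average momentum of 0.9, the clipping of weights to [0, 1], and the function names `pearson`, `select_and_update`, and `source_weights`. It should not be read as the authors' implementation.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    std_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    std_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if std_u == 0.0 or std_v == 0.0:
        return 0.0  # constant vector: correlation undefined, treat as 0
    return cov / (std_u * std_v)


def select_and_update(prototype, features, threshold=0.5, momentum=0.9):
    """FSPC-like step (assumed form): keep only the target features whose
    correlation with the class prototype exceeds `threshold`, then move the
    prototype toward the mean of the kept features."""
    kept = [f for f in features if pearson(prototype, f) > threshold]
    if not kept:
        return list(prototype), kept  # nothing reliable: leave prototype as is
    dim = len(prototype)
    mean = [sum(f[i] for f in kept) / len(kept) for i in range(dim)]
    updated = [momentum * p + (1.0 - momentum) * m
               for p, m in zip(prototype, mean)]
    return updated, kept


def source_weights(target_prototypes, source_features, source_labels):
    """WSPC-like step (assumed form): weight each source point by its
    correlation with the target prototype of its own class, clipped to [0, 1]."""
    weights = []
    for feature, label in zip(source_features, source_labels):
        r = pearson(target_prototypes[label], feature)
        weights.append(min(1.0, max(0.0, r)))
    return weights


if __name__ == "__main__":
    proto = [1.0, 2.0, 3.0, 4.0]               # toy target prototype, class 0
    feats = [[2.0, 4.0, 6.0, 8.0],             # positively correlated feature
             [4.0, 3.0, 2.0, 1.0]]             # negatively correlated feature
    new_proto, kept = select_and_update(proto, feats)
    print(new_proto, len(kept))
    print(source_weights({0: proto}, feats, [0, 0]))
```

In this toy example only the first feature passes the selection step, and the second receives a weight of zero. In a segmentation model, such weights would typically scale a per-pixel loss on source images, but how CACP applies them is not stated in the abstract.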


Source journal

IEEE Transactions on Multimedia (Engineering & Technology: Telecommunications)

CiteScore: 11.70

Self-citation rate: 11.00%

Articles published: 576

Review time: 5.5 months

Journal description: The IEEE Transactions on Multimedia delves into diverse aspects of multimedia technology and applications, covering circuits, networking, signal processing, systems, software, and systems integration. The scope aligns with the Fields of Interest of the sponsors, ensuring a comprehensive exploration of research in multimedia.