Xiating Jin;Jiajun Bu;Zhi Yu;Hui Zhang;Yaonan Wang
IEEE Transactions on Multimedia, vol. 27, pp. 1601-1616. Published 2024-12-25. DOI: 10.1109/TMM.2024.3521711. Available: https://ieeexplore.ieee.org/document/10814654/
Citations: 0
Abstract
Federated Hallucination Translation and Source-Free Regularization Adaptation in Decentralized Domain Adaptation for Foggy Scene Understanding
Semantic foggy scene understanding (SFSU) is a challenging task under out-of-domain (OD) distributions because degraded visibility makes predictions uncertain. Unsupervised domain adaptation (UDA) reduces vulnerability in OD scenarios, but it relies on the strong assumption that data are centralized. An enlarged domain gap and growing privacy concerns therefore pose serious challenges to conventional UDA. Motivated by gap decomposition and data decentralization, we establish a privacy-preserving decentralized domain adaptation (DDA) framework called Translate thEn Adapt (TEA). It has two main components. (1) For federated hallucination translation, we propose a Disentanglement and Contrastive-learning based Generative Adversarial Network (DisCoGAN), which imposes a contrastive prior and disentangles the latent space in cycle-consistent translation. To yield domain hallucination, a client trains the translator by minimizing the cross-entropy of its local classifier while maximizing the entropy of the global model. (2) For source-free regularization adaptation, we present a Prototypical-knowledge based Regularization Adaptation (ProRA) that aligns the joint distribution in the output space. Soft adversarial learning relaxes the binary label to rectify inter-domain discrepancy and inner-domain divergence. Structure clustering and entropy minimization pull intra-class features closer together and push inter-class features apart. Extensive experiments demonstrate the efficacy of TEA, which achieves 55.26% mIoU when adapting from GTA5 to Foggy Cityscapes and 46.25% mIoU when adapting from GTA5 to Foggy Zurich, outperforming other DDA methods for SFSU.
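The translator objective mentioned in the abstract (low cross-entropy under the local classifier, high entropy under the global model) can be illustrated with a minimal per-pixel sketch. This is not the authors' implementation: the abstract does not give the loss formula, so the combination `ce_local - weight * h_global`, the `weight` parameter, and all function names below are assumptions made for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Cross-entropy of a predicted distribution against one integer label."""
    return -math.log(max(probs[label], 1e-12))


def entropy(probs):
    """Shannon entropy (in nats) of a predicted distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def hallucination_loss(local_logits, global_logits, label, weight=1.0):
    """Hypothetical per-pixel translator loss (an assumption, not the paper's formula).

    Minimizing this value lowers the local classifier's cross-entropy on the
    translated pixel and raises the global model's entropy on the same pixel.
    """
    ce_local = cross_entropy(softmax(local_logits), label)
    h_global = entropy(softmax(global_logits))
    return ce_local - weight * h_global


if __name__ == "__main__":
    # Same local prediction, two different global predictions for one pixel.
    confident_global = hallucination_loss([5.0, 0.0, 0.0], [5.0, 0.0, 0.0], label=0)
    uncertain_global = hallucination_loss([5.0, 0.0, 0.0], [0.0, 0.0, 0.0], label=0)
    print(confident_global > uncertain_global)
```

Under this assumed form, a translated pixel that the global model is unsure about scores a lower loss than one it classifies confidently, which matches the "maximizes entropy of global model" wording. A real segmentation setting would average such terms over all pixels and images.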
About the journal:
The IEEE Transactions on Multimedia delves into diverse aspects of multimedia technology and applications, covering circuits, networking, signal processing, systems, software, and systems integration. The scope aligns with the Fields of Interest of the sponsors, ensuring a comprehensive exploration of research in multimedia.