Coarse-to-fine online latent representations matching for one-stage domain adaptive semantic segmentation

IF 7.6 | Zone 1 (Computer Science) | JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Zihao Dong, Sijie Niu, Xizhan Gao, Xiuli Shao
Pattern Recognition, Volume 146, Article 110019
Published: 2023-10-04
DOI: 10.1016/j.patcog.2023.110019
URL: https://www.sciencedirect.com/science/article/pii/S0031320323007161

Abstract

Domain adaptive semantic segmentation matters because collecting large numbers of labeled samples in every domain is expensive and time-consuming. Recent domain adaptation methods still fall short of supervised learning in performance. Under the hypothesis that semantic features can be shared across domains, this paper proposes a coarse-to-fine online matching architecture (COM) for one-stage domain adaptation. Successive learning stages progressively refine the task in the latent feature space: the finer set at each component is hierarchically derived from the coarser set of the previous components. The stages comprise cross-domain matching of global prototypes, categories, and instances, and anchor-point contrastive learning, which together achieve self-supervised learning with region-level pseudo labels generated within a single training step. First, feature refinement is performed to realize edge perception and inter-feature augmentation. Then, the coarse-to-fine network fuses global and local consistency matching via specific distribution alignment between the source and target domains. Finally, the adversarial structure controls the uncertainty of the generator's predictions by maximizing the classification results and minimizing the discrepancy between two classifiers. The proposed method is evaluated on two unsupervised domain adaptation tasks, GTA5 → Cityscapes and SYNTHIA → Cityscapes. Extensive experiments verify the effectiveness of the proposed COM model and demonstrate its superiority over several state-of-the-art approaches.
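
The abstract does not give the loss formulations, so the following is only a minimal NumPy sketch of two ideas it mentions: matching class-wise prototypes across domains, and measuring the discrepancy between two classifiers. Every name here (class_prototypes, prototype_matching_loss, classifier_discrepancy) is hypothetical, and the choices of a mean feature vector as the prototype, a squared L2 distance for matching, and an L1 distance between softmax outputs for the discrepancy are assumptions, not the authors' definitions.

```python
import numpy as np


def class_prototypes(features, labels, num_classes):
    """Mean latent feature per class (assumed prototype definition).

    features: (N, D) array of per-pixel latent features.
    labels:   (N,) integer class ids (ground truth on the source domain,
              pseudo labels on the target domain).
    Returns a (num_classes, D) array and a boolean mask of classes
    that have at least one pixel.
    """
    dim = features.shape[1]
    prototypes = np.zeros((num_classes, dim), dtype=np.float64)
    present = np.zeros(num_classes, dtype=bool)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            prototypes[c] = features[mask].mean(axis=0)
            present[c] = True
    return prototypes, present


def prototype_matching_loss(src, src_present, tgt, tgt_present):
    """Mean squared L2 distance over classes present in both domains."""
    shared = src_present & tgt_present
    if not shared.any():
        return 0.0
    diff = src[shared] - tgt[shared]
    return float((diff ** 2).sum(axis=1).mean())


def softmax(logits):
    """Row-wise softmax with max-subtraction for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def classifier_discrepancy(logits_a, logits_b):
    """Mean absolute difference between two classifiers' softmax outputs
    (an assumed L1 discrepancy, not taken from the paper)."""
    return float(np.abs(softmax(logits_a) - softmax(logits_b)).mean())
```

In a training loop, the prototype loss would pull same-class source and target features together, while the discrepancy term would be computed on target predictions from the two classifier heads. How COM weights or schedules these terms is not stated in the abstract.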

Source journal: Pattern Recognition
Category: Engineering and Technology, Engineering: Electrical and Electronic
CiteScore: 14.40
Self-citation rate: 16.20%
Articles published: 683
Review time: 5.6 months
About the journal: The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.