Coarse-to-fine online latent representations matching for one-stage domain adaptive semantic segmentation
Authors: Zihao Dong, Sijie Niu, Xizhan Gao, Xiuli Shao
Journal: Pattern Recognition, volume 146, Article 110019 (JCR Q1, Computer Science, Artificial Intelligence; impact factor 7.6)
Published: 2023-10-04
DOI: 10.1016/j.patcog.2023.110019
URL: https://www.sciencedirect.com/science/article/pii/S0031320323007161
Citations: 0
Abstract
Domain adaptive semantic segmentation is valuable because collecting large numbers of labeled samples in different domains is expensive and time-consuming. Recent domain adaptation methods still fall short of supervised learning in performance. Based on the hypothesis that semantic features can be shared across domains, this paper proposes a coarse-to-fine online matching architecture (COM) for one-stage domain adaptation. Successive learning stages progressively refine the task in the latent feature space, i.e. the finer set at each component is hierarchically derived from the coarser set of the previous components. The components include matching of cross-domain global prototypes, categories and instances, and anchor-point contrastive learning, which together achieve self-supervised learning with region-level pseudo labels generated in a single training step. First, feature refinement is performed to realize edge perception and inter-feature augmentation. Then, the coarse-to-fine network fuses global and local consistency matching via specific distribution alignment between the source and target domains. Finally, the adversarial structure controls the uncertainty of the generator's predictions through the maximization of classification results and the minimization of the discrepancy between two classifiers. The proposed method is evaluated on two unsupervised domain adaptation tasks, i.e. GTA5 → Cityscapes and SYNTHIA → Cityscapes. Extensive experiments verify the effectiveness of the proposed COM model and demonstrate its superiority over several state-of-the-art approaches.
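The abstract does not define its prototypes or its classifier discrepancy, so the following is only a minimal sketch under common assumptions, not the authors' implementation. It assumes a class prototype is the mean feature vector of a class, that prototypes are compared across domains with cosine similarity, and that the discrepancy between two classifiers is the mean absolute difference of their softmax outputs. All function names are hypothetical.

```python
import math
from collections import defaultdict


def class_prototypes(features, labels):
    """Return the mean feature vector per class (assumed prototype definition)."""
    sums = {}
    counts = defaultdict(int)
    for vec, lab in zip(features, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}


def cosine_similarity(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classifier_discrepancy(logits_a, logits_b):
    """Mean absolute difference between two classifiers' softmax outputs
    (an assumed discrepancy measure, not taken from the paper)."""
    pa, pb = softmax(logits_a), softmax(logits_b)
    return sum(abs(x - y) for x, y in zip(pa, pb)) / len(pa)


# Toy usage: source and target features with (pseudo) labels per pixel or region.
source_protos = class_prototypes([[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]], [0, 0, 1])
target_protos = class_prototypes([[1.0, 0.1], [0.2, 1.0]], [0, 1])
alignment = {c: cosine_similarity(source_protos[c], target_protos[c])
             for c in source_protos if c in target_protos}
```

In a setup of this kind, a matching loss would push each per-class similarity in `alignment` toward 1, while the adversarial step would alternate between increasing and decreasing `classifier_discrepancy` on target samples.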
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.