A Two-stage Cascading Method Based on Finetuning in Semi-supervised Domain Adaptation Semantic Segmentation

Huiying Chang, Kaixin Chen, Ming Wu
{"title":"A Two-stage Cascading Method Based on Finetuning in Semi-supervised Domain Adaptation Semantic Segmentation","authors":"Huiying Chang, Kaixin Chen, Ming Wu","doi":"10.23919/APSIPAASC55919.2022.9980206","DOIUrl":null,"url":null,"abstract":"The traditional unsupervised domain adaptation (UDA) has achieved great success in many computer vision tasks, especially semantic segmentation, which requires high cost of pixel-wise annotations. However, the final performance of UDA method is still far behind that of supervised learning due to the lack of annotations. Researchers introduce the semi-supervised learning (SSL) and propose a more practical setting, semi-supervised domain adaptation (SSDA), that is, having labeled source domain data and a small number of labeled target domain data. To address the inter-domain gap, current researches focus on domain alignment by mixing annotated data from two domains, but we argue that adapting the target domain data distribution through model transfer is a better solution. In this paper, we propose a two-stage SSDA framework based on this assumption. Firstly, we adapt the model from the source domain to the labeled dataset in the target domain. To verify the assumption, we choose a basic transfer mode: finetuning. Then, to align the labeled subspace and the unlabeled subspace of the target domain, we choose teacher-student model with class-level data augmentation as the basis to realize online self-training. We also provide a deformation to solve overfitting on the target domain with a small number of annotated data. Extensive experiments on two synthetic-to-real benchmarks have demonstrated the correctness of our idea and the effectiveness of our method. In most SSDA scenarios, our approach can achieve supervised performance or even better.","PeriodicalId":382967,"journal":{"name":"2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.23919/APSIPAASC55919.2022.9980206","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Traditional unsupervised domain adaptation (UDA) has achieved great success in many computer vision tasks, especially semantic segmentation, where pixel-wise annotations are costly. However, the final performance of UDA methods still lags far behind that of supervised learning due to the lack of annotations. Researchers have therefore introduced semi-supervised learning (SSL) and proposed a more practical setting, semi-supervised domain adaptation (SSDA), in which labeled source-domain data are available together with a small amount of labeled target-domain data. To address the inter-domain gap, current research focuses on domain alignment by mixing annotated data from the two domains, but we argue that adapting to the target-domain data distribution through model transfer is a better solution. In this paper, we propose a two-stage SSDA framework based on this assumption. First, we adapt the model from the source domain to the labeled dataset in the target domain; to verify the assumption, we choose a basic transfer mode, finetuning. Then, to align the labeled and unlabeled subspaces of the target domain, we build on a teacher-student model with class-level data augmentation to realize online self-training. We also provide a variant to mitigate overfitting when the target domain has only a small number of annotated images. Extensive experiments on two synthetic-to-real benchmarks demonstrate the correctness of our idea and the effectiveness of our method. In most SSDA scenarios, our approach matches or even exceeds fully supervised performance.
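To make the two-stage pipeline concrete, below is a minimal PyTorch-style sketch of the procedure the abstract describes: stage 1 finetunes a source-pretrained segmentation model on the small labeled target subset; stage 2 runs online self-training with an EMA teacher-student pair and class-level mixing. All names, hyperparameters, and loader conventions here are illustrative assumptions, not the authors' implementation, and the class-level augmentation is approximated with the common ClassMix-style paste as a stand-in for the paper's augmentation.

```python
# Minimal sketch of the two-stage SSDA pipeline, NOT the authors' code.
# Assumptions (hypothetical): `model` maps (B, 3, H, W) -> (B, C, H, W) logits;
# the labeled loader yields (image, label) pairs with 255 as the ignore index;
# the unlabeled loader yields images only.
import copy
import torch
import torch.nn.functional as F


def finetune_on_labeled_target(model, labeled_loader, epochs=10, lr=1e-4):
    """Stage 1: transfer the source-trained model to the target domain by
    plain finetuning on the few labeled target images."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    model.train()
    for _ in range(epochs):
        for images, labels in labeled_loader:
            opt.zero_grad()
            F.cross_entropy(model(images), labels, ignore_index=255).backward()
            opt.step()
    return model


@torch.no_grad()
def ema_update(teacher, student, alpha=0.999):
    # Teacher weights trail the student as an exponential moving average.
    # (BatchNorm buffers are omitted in this sketch.)
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(alpha).add_(s, alpha=1 - alpha)


def class_mix(l_img, l_gt, u_img, pseudo, ignore_index=255):
    """ClassMix-style paste: copy half of the labeled image's classes onto
    the unlabeled image, mixing ground truth with pseudo-labels."""
    classes = l_gt.unique()
    classes = classes[classes != ignore_index]
    chosen = classes[torch.randperm(len(classes))[: max(1, len(classes) // 2)]]
    mask = torch.isin(l_gt, chosen)                      # (B, H, W) bool
    mixed_img = torch.where(mask.unsqueeze(1), l_img, u_img)
    mixed_lbl = torch.where(mask, l_gt, pseudo)
    return mixed_img, mixed_lbl


def self_train(student, labeled_loader, unlabeled_loader, steps=1000, lr=1e-4):
    """Stage 2: online self-training to align the labeled and unlabeled
    subspaces of the target domain with an EMA teacher-student pair."""
    teacher = copy.deepcopy(student).eval()
    for p in teacher.parameters():
        p.requires_grad_(False)
    opt = torch.optim.SGD(student.parameters(), lr=lr, momentum=0.9)
    student.train()
    for step, ((l_img, l_gt), u_img) in enumerate(
            zip(labeled_loader, unlabeled_loader)):
        if step >= steps:
            break
        with torch.no_grad():
            pseudo = teacher(u_img).argmax(dim=1)        # hard pseudo-labels
        mixed_img, mixed_lbl = class_mix(l_img, l_gt, u_img, pseudo)
        opt.zero_grad()
        loss = (F.cross_entropy(student(l_img), l_gt, ignore_index=255)
                + F.cross_entropy(student(mixed_img), mixed_lbl,
                                  ignore_index=255))
        loss.backward()
        opt.step()
        ema_update(teacher, student)
    return student
```

The paper's anti-overfitting variant for the tiny labeled target set is not reproduced here; in practice, a confidence threshold on the teacher's pseudo-labels is also a common safeguard in this kind of self-training loop.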