Semi-supervised Semantic Segmentation with Directional Context-aware Consistency

Xin Lai, Zhuotao Tian, Li Jiang, Shu Liu, Hengshuang Zhao, Liwei Wang, Jiaya Jia
{"title":"Semi-supervised Semantic Segmentation with Directional Context-aware Consistency","authors":"Xin Lai, Zhuotao Tian, Li Jiang, Shu Liu, Hengshuang Zhao, Liwei Wang, Jiaya Jia","doi":"10.1109/CVPR46437.2021.00126","DOIUrl":null,"url":null,"abstract":"Semantic segmentation has made tremendous progress in recent years. However, satisfying performance highly depends on a large number of pixel-level annotations. Therefore, in this paper, we focus on the semi-supervised segmentation problem where only a small set of labeled data is provided with a much larger collection of totally unlabeled images. Nevertheless, due to the limited annotations, models may overly rely on the contexts available in the training data, which causes poor generalization to the scenes un-seen before. A preferred high-level representation should capture the contextual information while not losing self-awareness. Therefore, we propose to maintain the context-aware consistency between features of the same identity but with different contexts, making the representations robust to the varying environments. Moreover, we present the Directional Contrastive Loss (DC Loss) to accomplish the consistency in a pixel-to-pixel manner, only requiring the feature with lower quality to be aligned towards its counterpart. In addition, to avoid the false-negative samples and filter the uncertain positive samples, we put forward two sampling strategies. Extensive experiments show that our simple yet effective method surpasses current state-of-the-art methods by a large margin and also generalizes well with extra image-level annotations.","PeriodicalId":339646,"journal":{"name":"2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"119","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CVPR46437.2021.00126","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 119

Abstract

Semantic segmentation has made tremendous progress in recent years. However, satisfactory performance relies heavily on a large number of pixel-level annotations. Therefore, in this paper, we focus on the semi-supervised segmentation problem, where only a small set of labeled data is provided alongside a much larger collection of totally unlabeled images. Due to the limited annotations, models may overly rely on the contexts available in the training data, which causes poor generalization to previously unseen scenes. A preferred high-level representation should capture contextual information without losing self-awareness. Therefore, we propose to maintain context-aware consistency between features of the same identity but with different contexts, making the representations robust to varying environments. Moreover, we present the Directional Contrastive Loss (DC Loss) to accomplish the consistency in a pixel-to-pixel manner, only requiring the feature with lower quality to be aligned towards its counterpart. In addition, to avoid false-negative samples and filter out uncertain positive samples, we put forward two sampling strategies. Extensive experiments show that our simple yet effective method surpasses current state-of-the-art methods by a large margin and also generalizes well with extra image-level annotations.
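To make the idea of a directional, pixel-to-pixel contrastive consistency concrete, the sketch below shows a minimal PyTorch implementation under stated assumptions; it is not the authors' released code. The function name `directional_contrastive_loss`, the use of per-pixel prediction confidence to decide the alignment direction, the positive-filtering threshold `pos_conf_thresh`, the temperature value, and detaching the higher-confidence ("target") view are all illustrative assumptions consistent with the abstract, not details taken from the paper.

```python
# Minimal sketch of a directional pixel-wise contrastive loss (assumed PyTorch).
import torch
import torch.nn.functional as F


def directional_contrastive_loss(feat_a, feat_b, conf_a, conf_b,
                                 temperature=0.1, pos_conf_thresh=0.75):
    """Pull each lower-confidence pixel feature in view `a` toward its
    counterpart in view `b`, contrasting against the other pixels of `b`.

    feat_a, feat_b: (N, C) features at matched positions in the overlapping
                    region of two differently cropped views (assumed layout).
    conf_a, conf_b: (N,) per-pixel prediction confidences, e.g. max softmax
                    probability (assumed definition of "quality").
    """
    feat_a = F.normalize(feat_a, dim=1)
    # Treat view b as the target: detach so only the lower-quality feature
    # is aligned toward its counterpart (an assumption matching the abstract).
    feat_b = F.normalize(feat_b, dim=1).detach()

    # Cosine similarities between every pixel of view a and every pixel of b.
    logits = feat_a @ feat_b.t() / temperature          # (N, N)
    pos_sim = torch.diag(logits)                        # matched positions

    # InfoNCE-style per-pixel loss with cross-view pixels as negatives.
    loss_per_pixel = -pos_sim + torch.logsumexp(logits, dim=1)

    # Directional alignment: only pixels where view a is less confident
    # than view b contribute, so the better feature is never pulled back.
    direction_mask = (conf_a < conf_b).float()

    # Positive filtering: drop pairs whose target side is itself uncertain.
    pos_mask = (conf_b > pos_conf_thresh).float()

    weights = direction_mask * pos_mask
    return (loss_per_pixel * weights).sum() / weights.sum().clamp(min=1.0)
```

In practice the loss would be evaluated in both directions (a toward b and b toward a), so whichever view is less confident at a given pixel gets aligned. The paper's negative-sampling strategy for avoiding false negatives (e.g., excluding negatives that share the same pseudo label) is not shown here and would require an additional mask over `logits`.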