用于半监督语义分割的类概率空间正规化

IF 4.3 3区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Computer Vision and Image Understanding Pub Date : 2024-09-05 DOI:10.1016/j.cviu.2024.104146

Jianjian Yin , Shuai Yan , Tao Chen , Yi Chen , Yazhou Yao

{"title":"用于半监督语义分割的类概率空间正规化","authors":"Jianjian Yin , Shuai Yan , Tao Chen , Yi Chen , Yazhou Yao","doi":"10.1016/j.cviu.2024.104146","DOIUrl":null,"url":null,"abstract":"<div><p>Semantic segmentation achieves fine-grained scene parsing in any scenario, making it one of the key research directions to facilitate the development of human visual attention mechanisms. Recent advancements in semi-supervised semantic segmentation have attracted considerable attention due to their potential in leveraging unlabeled data. However, existing methods only focus on exploring the knowledge of unlabeled pixels with high certainty prediction. Their insufficient mining of low certainty regions of unlabeled data results in a significant loss of supervisory information. Therefore, this paper proposes the <strong>C</strong>lass <strong>P</strong>robability <strong>S</strong>pace <strong>R</strong>egularization (<strong>CPSR</strong>) approach to further exploit the potential of each unlabeled pixel. Specifically, we first design a class knowledge reshaping module to regularize the probability space of low certainty pixels, thereby transforming them into high certainty ones for supervised training. Furthermore, we propose a tail probability suppression module to suppress the probabilities of tailed classes, which facilitates the network to learn more discriminative information from the class probability space. Extensive experiments conducted on the PASCAL VOC2012 and Cityscapes datasets prove that our method achieves state-of-the-art performance without introducing much computational overhead. Code is available at <span><span>https://github.com/MKSAQW/CPSR</span><svg><path></path></svg></span>.</p></div>","PeriodicalId":50633,"journal":{"name":"Computer Vision and Image Understanding","volume":"249 ","pages":"Article 104146"},"PeriodicalIF":4.3000,"publicationDate":"2024-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Class Probability Space Regularization for semi-supervised semantic segmentation\",\"authors\":\"Jianjian Yin , Shuai Yan , Tao Chen , Yi Chen , Yazhou Yao\",\"doi\":\"10.1016/j.cviu.2024.104146\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Semantic segmentation achieves fine-grained scene parsing in any scenario, making it one of the key research directions to facilitate the development of human visual attention mechanisms. Recent advancements in semi-supervised semantic segmentation have attracted considerable attention due to their potential in leveraging unlabeled data. However, existing methods only focus on exploring the knowledge of unlabeled pixels with high certainty prediction. Their insufficient mining of low certainty regions of unlabeled data results in a significant loss of supervisory information. Therefore, this paper proposes the <strong>C</strong>lass <strong>P</strong>robability <strong>S</strong>pace <strong>R</strong>egularization (<strong>CPSR</strong>) approach to further exploit the potential of each unlabeled pixel. Specifically, we first design a class knowledge reshaping module to regularize the probability space of low certainty pixels, thereby transforming them into high certainty ones for supervised training. Furthermore, we propose a tail probability suppression module to suppress the probabilities of tailed classes, which facilitates the network to learn more discriminative information from the class probability space. Extensive experiments conducted on the PASCAL VOC2012 and Cityscapes datasets prove that our method achieves state-of-the-art performance without introducing much computational overhead. Code is available at <span><span>https://github.com/MKSAQW/CPSR</span><svg><path></path></svg></span>.</p></div>\",\"PeriodicalId\":50633,\"journal\":{\"name\":\"Computer Vision and Image Understanding\",\"volume\":\"249 \",\"pages\":\"Article 104146\"},\"PeriodicalIF\":4.3000,\"publicationDate\":\"2024-09-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computer Vision and Image Understanding\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1077314224002273\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer Vision and Image Understanding","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1077314224002273","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

语义分割可在任何场景下实现细粒度场景解析，因此成为促进人类视觉注意力机制发展的关键研究方向之一。由于半监督语义分割在利用无标记数据方面的潜力，其最新进展引起了广泛关注。然而，现有的方法只专注于探索未标记像素的高确定性预测知识。这些方法对未标记数据的低确定性区域挖掘不足，导致监督信息的严重损失。因此，本文提出了类概率空间正则化（CPSR）方法，以进一步挖掘每个未标记像素的潜力。具体来说，我们首先设计了一个类知识重塑模块，对低确定性像素的概率空间进行正则化，从而将其转化为高确定性像素，用于监督训练。此外，我们还提出了尾部概率抑制模块，以抑制尾部类别的概率，从而促进网络从类别概率空间中学习更多的判别信息。在 PASCAL VOC2012 和 Cityscapes 数据集上进行的大量实验证明，我们的方法在不引入大量计算开销的情况下实现了最先进的性能。代码见 https://github.com/MKSAQW/CPSR。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Class Probability Space Regularization for semi-supervised semantic segmentation

Semantic segmentation achieves fine-grained scene parsing in any scenario, making it one of the key research directions to facilitate the development of human visual attention mechanisms. Recent advancements in semi-supervised semantic segmentation have attracted considerable attention due to their potential in leveraging unlabeled data. However, existing methods only focus on exploring the knowledge of unlabeled pixels with high certainty prediction. Their insufficient mining of low certainty regions of unlabeled data results in a significant loss of supervisory information. Therefore, this paper proposes the Class Probability Space Regularization (CPSR) approach to further exploit the potential of each unlabeled pixel. Specifically, we first design a class knowledge reshaping module to regularize the probability space of low certainty pixels, thereby transforming them into high certainty ones for supervised training. Furthermore, we propose a tail probability suppression module to suppress the probabilities of tailed classes, which facilitates the network to learn more discriminative information from the class probability space. Extensive experiments conducted on the PASCAL VOC2012 and Cityscapes datasets prove that our method achieves state-of-the-art performance without introducing much computational overhead. Code is available at https://github.com/MKSAQW/CPSR.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Computer Vision and Image Understanding 工程技术-工程：电子与电气

CiteScore

7.80

自引率

4.40%

发文量

112

审稿时长

79 days

期刊介绍： The central focus of this journal is the computer analysis of pictorial information. Computer Vision and Image Understanding publishes papers covering all aspects of image analysis from the low-level, iconic processes of early vision to the high-level, symbolic processes of recognition and interpretation. A wide range of topics in the image understanding area is covered, including papers offering insights that differ from predominant views. Research Areas Include: • Theory • Early vision • Data structures and representations • Shape • Range • Motion • Matching and recognition • Architecture and languages • Vision systems