Diversify and Conquer: Open-Set Disagreement for Robust Semi-Supervised Learning With Outliers

IF 8.9 | CAS Tier 1 (Computer Science) | JCR Q1, Computer Science, Artificial Intelligence
Heejo Kong;Sung-Jin Kim;Gunho Jung;Seong-Whan Lee
{"title":"Diversify and Conquer: Open-Set Disagreement for Robust Semi-Supervised Learning With Outliers","authors":"Heejo Kong;Sung-Jin Kim;Gunho Jung;Seong-Whan Lee","doi":"10.1109/TNNLS.2025.3547801","DOIUrl":null,"url":null,"abstract":"Conventional semi-supervised learning (SSL) ideally assumes that labeled and unlabeled data share an identical class distribution; however, in practice, this assumption is easily violated, as unlabeled data often includes unknown class data, i.e., outliers. The outliers are treated as noise, considerably degrading the performance of SSL models. To address this drawback, we propose a novel framework, diversify and conquer (DAC), to enhance SSL robustness in the context of open-set SSL (OSSL). In particular, we note that existing OSSL methods rely on prediction discrepancies between inliers and outliers from a single model trained on labeled data. This approach can be easily failed when the labeled data are insufficient, leading to performance degradation that is worse than naive SSL that do not account for outliers. In contrast, our approach exploits prediction disagreements among multiple models that are differently biased toward the unlabeled distribution. By leveraging the discrepancies arising from training on unlabeled data, our method enables robust outlier detection, even when the labeled data are underspecified. Our key contribution is constructing a collection of differently biased models through a single training process. By encouraging divergent heads to be differently biased toward outliers while making consistent predictions for inliers, we exploit the disagreement among these heads as a measure to identify unknown concepts. Extensive experiments demonstrate that our method significantly surpasses state-of-the-art OSSL methods across various protocols.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"36 6","pages":"9879-9892"},"PeriodicalIF":8.9000,"publicationDate":"2025-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10944499","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on neural networks and learning systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10944499/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Conventional semi-supervised learning (SSL) ideally assumes that labeled and unlabeled data share an identical class distribution; however, in practice, this assumption is easily violated, as unlabeled data often includes unknown class data, i.e., outliers. The outliers are treated as noise, considerably degrading the performance of SSL models. To address this drawback, we propose a novel framework, diversify and conquer (DAC), to enhance SSL robustness in the context of open-set SSL (OSSL). In particular, we note that existing OSSL methods rely on prediction discrepancies between inliers and outliers from a single model trained on labeled data. This approach can easily fail when the labeled data are insufficient, leading to performance degradation worse than that of naive SSL methods that do not account for outliers. In contrast, our approach exploits prediction disagreements among multiple models that are differently biased toward the unlabeled distribution. By leveraging the discrepancies arising from training on unlabeled data, our method enables robust outlier detection, even when the labeled data are underspecified. Our key contribution is constructing a collection of differently biased models through a single training process. By encouraging divergent heads to be differently biased toward outliers while making consistent predictions for inliers, we exploit the disagreement among these heads as a measure to identify unknown concepts. Extensive experiments demonstrate that our method significantly surpasses state-of-the-art OSSL methods across various protocols.
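
The abstract describes the core mechanism: a shared backbone with several divergent heads trained in a single pass, kept consistent on labeled inliers but encouraged to disagree on unlabeled data, with their disagreement used as an outlier score. The sketch below only illustrates that general idea; the MultiHeadClassifier class, the pairwise-KL diversity term, the div_weight value, and the variance-based outlier_score are illustrative assumptions, not the paper's exact DAC objective.

```python
# Minimal sketch of disagreement-based outlier scoring with multiple heads.
# Assumes a shared backbone and K linear heads; names and loss terms are
# illustrative, not the published DAC formulation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiHeadClassifier(nn.Module):
    """Shared feature extractor followed by K divergent classification heads."""

    def __init__(self, backbone: nn.Module, feat_dim: int, num_classes: int, num_heads: int = 4):
        super().__init__()
        self.backbone = backbone
        self.heads = nn.ModuleList([nn.Linear(feat_dim, num_classes) for _ in range(num_heads)])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(x)                                     # (B, feat_dim)
        return torch.stack([h(feats) for h in self.heads], dim=1)    # (B, K, C)


def consistency_and_diversity_loss(logits_lab, labels, logits_unlab, div_weight=0.1):
    """Supervised term keeps all heads consistent on labeled inliers;
    a pairwise-KL term pushes them apart on unlabeled data that may contain outliers."""
    # Every head fits the labeled data -> agreement on inliers.
    sup = sum(F.cross_entropy(logits_lab[:, k], labels) for k in range(logits_lab.size(1)))

    # Encourage disagreement on unlabeled samples by maximizing mean pairwise KL.
    probs = logits_unlab.softmax(dim=-1)        # (B, K, C)
    logp = logits_unlab.log_softmax(dim=-1)
    K = probs.size(1)
    kl = 0.0
    for i in range(K):
        for j in range(K):
            if i != j:
                kl = kl + F.kl_div(logp[:, j], probs[:, i], reduction="batchmean")
    diversity = kl / (K * (K - 1))

    # Minimizing this total minimizes the supervised loss and maximizes diversity.
    return sup - div_weight * diversity


def outlier_score(logits: torch.Tensor) -> torch.Tensor:
    """Disagreement among heads as an open-set score: variance of head predictions."""
    probs = logits.softmax(dim=-1)              # (B, K, C)
    return probs.var(dim=1).sum(dim=-1)         # higher value -> more likely an outlier


# Example usage (hypothetical backbone mapping images to 128-d features):
# model = MultiHeadClassifier(backbone, feat_dim=128, num_classes=6, num_heads=4)
# scores = outlier_score(model(unlabeled_batch))
```

Any measure of inter-head spread (pairwise KL, prediction variance, or the gap between the entropy of the averaged prediction and the average per-head entropy) could serve as the disagreement score; the variance used here is simply the most compact choice for a sketch.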
Source Journal

IEEE Transactions on Neural Networks and Learning Systems
Categories: Computer Science, Artificial Intelligence; Computer Science, Hardware & Architecture
CiteScore: 23.80
Self-citation rate: 9.60%
Publication volume: 2102
Review time: 3-8 weeks
About the journal: The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.