OTAMatch: Optimal Transport Assignment With PseudoNCE for Semi-Supervised Learning

Jinjin Zhang, Junjie Liu, Debang Li, Qiuyu Huang, Jiaxin Chen, Di Huang
DOI: 10.1109/TIP.2024.3425174
Journal: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Publication date: 2024-07-15
Publication type: Journal Article
URL: https://ieeexplore.ieee.org/document/10599208/
Citations: 0

Abstract

In semi-supervised learning (SSL), many approaches follow the effective self-training paradigm with consistency regularization, utilizing threshold heuristics to alleviate label noise. However, such threshold heuristics lead to the underutilization of crucial discriminative information from the excluded data. In this paper, we present OTAMatch, a novel SSL framework that reformulates pseudo-labeling as an optimal transport (OT) assignment problem and simultaneously exploits data with high confidence to mitigate the confirmation bias. Firstly, OTAMatch models the pseudo-label allocation task as a convex minimization problem, facilitating end-to-end optimization with all pseudo-labels and employing the Sinkhorn-Knopp algorithm for efficient approximation. Meanwhile, we incorporate epsilon-greedy posterior regularization and curriculum bias correction strategies to constrain the distribution of OT assignments, improving the robustness with noisy pseudo-labels. Secondly, we propose PseudoNCE, which explicitly exploits pseudo-label consistency with threshold heuristics to maximize mutual information within self-training, significantly boosting the balance of convergence speed and performance. Consequently, our proposed approach achieves competitive performance on various SSL benchmarks. Specifically, OTAMatch substantially outperforms the previous state-of-the-art SSL algorithms in realistic and challenging scenarios, exemplified by a notable 9.45% error rate reduction over SoftMatch on ImageNet with 100K-label split, underlining its robustness and effectiveness.
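
To make the optimal transport view of pseudo-labeling more concrete, the following is a minimal sketch of Sinkhorn-Knopp scaling applied to a small batch of class probabilities. It is not the authors' implementation: the function name `sinkhorn_assign`, the parameters `epsilon` and `n_iters`, and the choice of uniform sample and class marginals are all assumptions made for illustration, and the epsilon-greedy posterior regularization and curriculum bias correction mentioned in the abstract are not reproduced here.

```python
def sinkhorn_assign(probs, epsilon=0.5, n_iters=50):
    """Soft pseudo-labels from entropic OT with uniform marginals (illustrative)."""
    n, c = len(probs), len(probs[0])
    # Gibbs kernel: with cost -log p, exp(-cost / epsilon) equals p ** (1 / epsilon).
    q = [[max(p, 1e-12) ** (1.0 / epsilon) for p in row] for row in probs]
    for _ in range(n_iters):
        # Scale columns so that each class receives total mass 1 / c.
        col = [sum(q[i][j] for i in range(n)) for j in range(c)]
        q = [[q[i][j] / (col[j] * c) for j in range(c)] for i in range(n)]
        # Scale rows so that each sample carries total mass 1 / n.
        q = [[v / (sum(row) * n) for v in row] for row in q]
    # Multiply by n so every row is a distribution over classes.
    return [[v * n for v in row] for row in q]


# Hypothetical predictions: a plain argmax would label all four samples as class 0.
probs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4]]
assignments = sinkhorn_assign(probs)
for row in assignments:
    print([round(v, 3) for v in row])
```

Under the assumed uniform class marginal, the least confident sample is pushed toward class 1 even though its raw prediction favors class 0, which illustrates how an OT assignment can use every sample instead of discarding those below a confidence threshold.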