Partial Label Learning with Noisy Labels

Q1 Decision Sciences

Annals of Data Science, Pub Date: 2024-07-31, DOI: 10.1007/s40745-024-00552-1

Pan Zhao, Long Tang, Zhigeng Pan

Volume 12, Issue 1, pp. 199-212

Citations: 0

Abstract



Partial label learning (PLL) is a problem setting within weakly supervised learning in which each sample is associated with a candidate label set containing exactly one true label. In some practical applications, however, label noise can cause some candidate sets to lose their true labels, which degrades model performance. This work proposes a robust training strategy for PLL, derived from joint training with co-regularization (JoCoR), to address this issue. Specifically, the proposed approach constructs two separate PLL models and a joint loss. The joint loss consists of the two PLL losses plus a co-regularization term that measures the disagreement between the two models. By automatically selecting samples with small joint loss and using them to update both models, the approach filters out a growing number of samples suspected of having noisy candidate label sets. As the disagreement between the two models shrinks, the robustness of the PLL models to label noise gradually strengthens. Experiments are conducted on two state-of-the-art PLL models using benchmark datasets under various noise levels. The results show that the proposed method can effectively stabilize the training process and reduce the model's overfitting to noisy candidate label sets.
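The joint loss and small-loss selection described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not specify the PLL loss, the form of the co-regularization term, or the weighting, so everything concrete here is an assumption: the PLL loss is taken to be the negative log of the probability mass on the candidate set, the disagreement term is a symmetric KL divergence, and the two are mixed with a hypothetical weight `lam`. All function names are illustrative and are not taken from the paper.

```python
import numpy as np

def softmax(logits):
    # Row-wise softmax with the usual max-subtraction for numerical stability.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def pll_loss(probs, candidates, eps=1e-12):
    # ASSUMPTION: a simple PLL loss, the negative log of the total
    # probability assigned to the candidate labels (candidates is a 0/1 mask).
    return -np.log((probs * candidates).sum(axis=1) + eps)

def sym_kl(p, q, eps=1e-12):
    # ASSUMPTION: disagreement measured as symmetric KL divergence per sample.
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return (p * np.log(p / q)).sum(axis=1) + (q * np.log(q / p)).sum(axis=1)

def joint_loss(logits1, logits2, candidates, lam=0.5):
    # Per-sample joint loss: two PLL losses plus a co-regularization term.
    # The (1 - lam) / lam weighting is a hypothetical choice.
    p1, p2 = softmax(logits1), softmax(logits2)
    supervised = pll_loss(p1, candidates) + pll_loss(p2, candidates)
    return (1.0 - lam) * supervised + lam * sym_kl(p1, p2)

def select_small_loss(losses, keep_ratio):
    # Indices of the keep_ratio fraction of samples with the smallest joint loss.
    n_keep = max(1, int(round(keep_ratio * len(losses))))
    return np.argsort(losses)[:n_keep]

# Toy batch: both models confidently predict class 0 for every sample.
logits = np.array([[5.0, 0.0, 0.0]] * 4)
# Samples 0-2 keep class 0 in their candidate sets; sample 3 has lost it.
candidates = np.array([
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 1],
], dtype=float)

losses = joint_loss(logits, logits, candidates)
kept = select_small_loss(losses, keep_ratio=0.75)
print(sorted(kept.tolist()))
```

In this toy batch the sample whose candidate set excludes the class both models agree on receives the largest joint loss, so it is the one dropped when only 75% of the batch is kept. In an actual training loop, only the kept indices would contribute to the gradient update of the two models.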


Source journal

Annals of Data Science (Decision Sciences: Statistics, Probability and Uncertainty)

CiteScore

6.50

Self-citation rate

0.00%

Number of articles

Journal description: Annals of Data Science (ADS) publishes cutting-edge research findings, experimental results, and case studies in data science. Although data science is regarded as an interdisciplinary field that uses mathematics, statistics, databases, data mining, high-performance computing, knowledge management, and virtualization to discover knowledge from Big Data, it should have its own scientific content, such as axioms, laws, and rules, which are fundamentally important for experts in different fields exploring their own interests in Big Data. ADS encourages contributors to address such challenging problems on this exchange platform. At present, how to discover knowledge from heterogeneous data in a Big Data environment needs to be addressed. ADS is a series of volumes edited by either the editorial office or guest editors. Guest editors are responsible for the call for papers and the review process for high-quality contributions in their volumes.