Few-shot partial multi-label learning with credible non-candidate label

IF 6.8 · CAS Region 1, Computer Science · COMPUTER SCIENCE, INFORMATION SYSTEMS
Meng Wang, Yunfeng Zhao, Zhongmin Yan, Jinglin Zhang, Jun Wang, Guoxian Yu
{"title":"Few-shot partial multi-label learning with credible non-candidate label","authors":"Meng Wang ,&nbsp;Yunfeng Zhao ,&nbsp;Zhongmin Yan ,&nbsp;Jinglin Zhang ,&nbsp;Jun Wang ,&nbsp;Guoxian Yu","doi":"10.1016/j.ins.2025.122485","DOIUrl":null,"url":null,"abstract":"<div><div>Partial multi-label learning (PML) addresses scenarios where each training sample is associated with multiple candidate labels, but only a subset are ground-truth labels. The primary difficulty in PML is to mitigate the negative impact of noisy labels. Most existing PML methods rely on sufficient samples to train a noise-robust multi-label classifier. However, in practical scenarios, such as privacy-sensitive domains or those with limited data, only a few training samples are typically available for the target task. In this paper, we propose an approach called <span>FsPML-CNL</span> (Few-shot Partial Multi-label Learning with Credible Non-candidate Label) to tackle the PML problem with few-shot training samples. Specifically, <span>FsPML-CNL</span> first utilizes the sample features and feature-prototype similarity in the embedding space to disambiguate candidate labels and to obtain label prototypes. Then, the credible non-candidate label is selected based on label correlation and confidence, and its prototype is incorporated into the training samples to generate new data for boosting supervised information. The noise-tolerant multi-label classifier is finally induced with the original and generated samples, along with the confidence-guided loss. Extensive experiments on public datasets demonstrate that <span>FsPML-CNL</span> outperforms competitive baselines across different settings.</div></div>","PeriodicalId":51063,"journal":{"name":"Information Sciences","volume":"719 ","pages":"Article 122485"},"PeriodicalIF":6.8000,"publicationDate":"2025-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Sciences","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0020025525006176","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"0","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 0

Abstract

Partial multi-label learning (PML) addresses scenarios where each training sample is associated with multiple candidate labels, only a subset of which are ground-truth labels. The primary difficulty in PML is mitigating the negative impact of these noisy labels. Most existing PML methods rely on a sufficient number of samples to train a noise-robust multi-label classifier. However, in practical scenarios, such as privacy-sensitive domains or those with limited data, only a few training samples are typically available for the target task. In this paper, we propose an approach called FsPML-CNL (Few-shot Partial Multi-label Learning with Credible Non-candidate Label) to tackle the PML problem with few-shot training samples. Specifically, FsPML-CNL first uses the sample features and feature-prototype similarity in the embedding space to disambiguate candidate labels and to obtain label prototypes. It then selects a credible non-candidate label based on label correlation and confidence, and incorporates that label's prototype into the training samples to generate new data that boosts the supervised information. Finally, a noise-tolerant multi-label classifier is induced from the original and generated samples using a confidence-guided loss. Extensive experiments on public datasets demonstrate that FsPML-CNL outperforms competitive baselines across different settings.
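The abstract only sketches the method at a high level, so the snippet below is an illustrative reconstruction of the prototype-based disambiguation step and a generic confidence-guided loss, not the paper's actual implementation. All function names, the cosine-similarity scoring, and the weighted binary cross-entropy form are assumptions introduced here for illustration.

```python
import numpy as np

def label_prototypes(embeddings, candidate_mask):
    """Sketch: one prototype per label, as the mean embedding of the few-shot
    samples that list that label as a candidate.
    embeddings:     (n_samples, d) sample features in the embedding space
    candidate_mask: (n_samples, n_labels) binary candidate-label matrix
    """
    counts = candidate_mask.sum(axis=0)[:, None]                # (n_labels, 1)
    return candidate_mask.T @ embeddings / np.maximum(counts, 1)

def candidate_confidence(embeddings, prototypes, candidate_mask, temperature=1.0):
    """Disambiguation sketch: score each candidate label by the cosine
    similarity between the sample embedding and the label prototype, then
    normalise over the candidate set only (non-candidates keep confidence 0)."""
    z = embeddings / (np.linalg.norm(embeddings, axis=1, keepdims=True) + 1e-12)
    p = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + 1e-12)
    sim = np.exp((z @ p.T) / temperature) * candidate_mask      # (n_samples, n_labels)
    return sim / np.maximum(sim.sum(axis=1, keepdims=True), 1e-12)

def confidence_guided_bce(pred, confidence, candidate_mask):
    """Confidence-guided loss sketch (assumed form, not the paper's exact loss):
    candidate labels act as positives weighted by their confidence, while
    non-candidate labels are treated as negatives."""
    eps = 1e-12
    pos = -confidence * np.log(pred + eps)                      # zero for non-candidates
    neg = -(1.0 - candidate_mask) * np.log(1.0 - pred + eps)
    return float((pos + neg).mean())
```

Under these assumptions, the credible non-candidate label step would pick, for each sample, a non-candidate label strongly correlated with its high-confidence candidates (e.g., via prototype similarity or label co-occurrence) and mix that label's prototype into the sample features to synthesise additional few-shot training data, as the abstract describes.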


Source Journal
Information Sciences (Engineering & Technology – Computer Science: Information Systems)
CiteScore: 14.00
Self-citation rate: 17.30%
Articles published: 1322
Review time: 10.4 months
Journal description: Informatics and Computer Science Intelligent Systems Applications is an esteemed international journal that focuses on publishing original and creative research findings in the field of information sciences. We also feature a limited number of timely tutorial and surveying contributions. Our journal aims to cater to a diverse audience, including researchers, developers, managers, strategic planners, graduate students, and anyone interested in staying up-to-date with cutting-edge research in information science, knowledge engineering, and intelligent systems. While readers are expected to share a common interest in information science, they come from varying backgrounds such as engineering, mathematics, statistics, physics, computer science, cell biology, molecular biology, management science, cognitive science, neurobiology, behavioral sciences, and biochemistry.