Semi-supervised label distribution learning based on global factorization and local constraints

IF 5.5 | Region 2, Computer Science | JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neurocomputing Pub Date : 2025-07-19 DOI:10.1016/j.neucom.2025.131024

Peiqiu Yu, Xiuyi Jia

Volume 652, Article 131024. Source: https://www.sciencedirect.com/science/article/pii/S0925231225016960

Citations: 0

Abstract



Semi-supervised label distribution learning via global factorization and local constrain

In label distribution learning, properly handling samples with missing label distributions is particularly challenging. When dealing with unlabeled samples, leveraging correlation is especially crucial because it reveals the intrinsic patterns of the data distribution and effectively reduces the model's hypothesis space. Currently, semi-supervised label distribution learning uses the same correlation mining methods as those used under complete supervision. However, because some samples lack supervision information, methods designed for complete supervision are insufficient in a semi-supervised context. On the one hand, the absence of labels for some samples makes label correlations difficult to mine; on the other hand, label correlations mined solely based on samples are biased, and the missing labels make them imprecise. To address these issues, this paper proposes two strategies for mining label correlations in semi-supervised label distribution learning: first, exploring the common correlations between known and unknown label distributions; second, using the information of known label distributions to reveal the correlations of unknown label distributions. Specifically, globally, we employ independent component analysis for matrix completion of missing sample labels, and locally, we improve the k-NN framework so that the label constraints of known label distributions restrict the values of unknown label distributions. Based on these mined correlations, we design a semi-supervised label distribution learning algorithm. The algorithm outperforms existing methods in 67.27% of cases and shows statistical significance in two-sample t-tests.
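To make the local idea in the abstract more concrete, the following Python sketch estimates label distributions for unlabeled samples by averaging the known distributions of their nearest labeled neighbors. It is only a plain k-NN baseline under stated assumptions: the function name `knn_label_distributions`, the Euclidean distance, and the toy data are hypothetical choices for illustration, and the sketch does not reproduce the paper's improved k-NN framework or its ICA-based matrix completion, whose details the abstract does not give.

```python
import numpy as np


def knn_label_distributions(X_labeled, D_labeled, X_unlabeled, k=3):
    """Estimate label distributions for unlabeled samples by averaging
    the distributions of their k nearest labeled neighbors.

    Plain k-NN baseline for illustration only; it is not the improved
    k-NN framework or the ICA-based completion described in the abstract.
    """
    X_labeled = np.asarray(X_labeled, dtype=float)
    D_labeled = np.asarray(D_labeled, dtype=float)
    X_unlabeled = np.asarray(X_unlabeled, dtype=float)

    # Pairwise squared Euclidean distances, shape (n_unlabeled, n_labeled).
    diffs = X_unlabeled[:, None, :] - X_labeled[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)

    # Indices of the k closest labeled samples for each unlabeled sample.
    k = min(k, X_labeled.shape[0])
    neighbors = np.argsort(dists, axis=1)[:, :k]

    # Average the neighbors' distributions. A mean of valid distributions
    # is non-negative and sums to 1, so the result stays on the simplex.
    estimates = D_labeled[neighbors].mean(axis=1)

    # Renormalize to guard against floating-point drift.
    return estimates / estimates.sum(axis=1, keepdims=True)


if __name__ == "__main__":
    # Toy data (made up for illustration): 4 labeled samples, 3 labels.
    X_l = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]])
    D_l = np.array([[0.7, 0.2, 0.1],
                    [0.6, 0.3, 0.1],
                    [0.1, 0.2, 0.7],
                    [0.2, 0.2, 0.6]])
    X_u = np.array([[0.05, 0.0], [0.95, 1.0]])
    print(knn_label_distributions(X_l, D_l, X_u, k=2))
```

Averaging keeps every estimate a valid distribution, which is the simplest form of letting known label distributions constrain unknown ones; a method like the one in the abstract would presumably add further constraints on top of such a neighborhood structure.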


Source journal

Neurocomputing (Engineering and Technology, Computer Science: Artificial Intelligence)

CiteScore: 13.10

Self-citation rate: 10.00%

Articles published: 1382

Review time: 70 days

About the journal: Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.