Ruijia Li , Yingcang Ma , Hong Chen , Xiaofei Yang , Zhiwei Xing
Neurocomputing, vol. 658, Article 131640 (Journal Article). Published 2025-09-30. DOI: 10.1016/j.neucom.2025.131640
Impact Factor: 6.5. JCR: Q1 (Computer Science, Artificial Intelligence).
https://www.sciencedirect.com/science/article/pii/S0925231225023124
Coordinate descent for top-k multi-label feature selection with pseudo-label learning and manifold learning
Multi-label learning plays an increasingly important role in handling complex problems where data instances are associated with multiple labels. However, current methods face significant limitations when dealing with high-dimensional feature spaces. They struggle to preserve the geometric structure among features while failing to fully exploit the latent correlations between labels. To address these key challenges, this paper proposes a novel feature selection method called coordinate descent for top-k multi-label feature selection with pseudo-label learning and manifold learning (CD-MPL), which integrates manifold learning with pseudo-label learning techniques. First, by constructing a feature graph Laplacian matrix, we establish a mathematical representation of the feature manifold structure, effectively preserving the local geometric properties of the feature space. Second, we introduce a pseudo-label learning mechanism, converting discrete binary labels into continuous representations to better model complex label correlations. Notably, to tackle the non-convex optimization problem caused by the $\ell_{2,0}$-norm constraint, we innovatively transform the original problem into the joint optimization of a continuous matrix and a discrete selection matrix. We then employ a coordinate descent (CD) method to efficiently solve the selection matrix, overcoming the non-convexity issue while enhancing model performance, interpretability, and practicality. Experimental results on ten multi-label datasets demonstrate that CD-MPL significantly outperforms existing methods across multiple key evaluation metrics, achieving an average performance improvement of 3.31%. The algorithm maintains stable performance even with reduced feature subsets and exhibits rapid convergence within 10 iterations, fully validating its efficiency and effectiveness in multi-label classification tasks.
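To make two of the ingredients named in the abstract concrete, the following is a minimal Python sketch, not the authors' CD-MPL implementation. It assumes a Gaussian-kernel k-nearest-neighbor graph over features (the abstract does not say how the feature graph is built) and uses the standard definition of the $\ell_{2,0}$-norm as the number of non-zero rows of a weight matrix. The function names `feature_graph_laplacian`, `l20_norm`, and `top_k_rows` and the parameters `n_neighbors` and `sigma` are hypothetical. The row-norm ranking in `top_k_rows` is a simple stand-in that only shows what a top-k constraint means; it is not the coordinate descent solver described in the paper.

```python
import numpy as np


def feature_graph_laplacian(X, n_neighbors=3, sigma=1.0):
    """Unnormalized Laplacian L = D - S of a kNN graph whose nodes are the
    features (columns) of X. The graph construction is an assumption."""
    F = X.T  # one row per feature
    sq = np.sum(F ** 2, axis=1)
    # Pairwise squared Euclidean distances between feature vectors.
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (F @ F.T), 0.0)
    S = np.exp(-dist2 / (2.0 * sigma ** 2))  # Gaussian similarity
    S = (S + S.T) / 2.0                      # remove floating-point asymmetry
    np.fill_diagonal(S, 0.0)                 # no self-loops
    # Keep the n_neighbors strongest edges of each feature, then symmetrize.
    d = S.shape[0]
    nearest = np.argsort(-S, axis=1)[:, :n_neighbors]
    mask = np.zeros((d, d), dtype=bool)
    mask[np.arange(d)[:, None], nearest] = True
    S = np.where(mask | mask.T, S, 0.0)
    return np.diag(S.sum(axis=1)) - S


def l20_norm(W, tol=1e-12):
    """l_{2,0}-norm: the number of rows of W with non-zero l2 norm."""
    return int(np.sum(np.linalg.norm(W, axis=1) > tol))


def top_k_rows(W, k):
    """Keep the k rows of W with the largest l2 norm and zero out the rest.
    Illustrative heuristic only, not the paper's coordinate descent step."""
    norms = np.linalg.norm(W, axis=1)
    selected = np.sort(np.argsort(-norms)[:k])
    W_k = np.zeros_like(W)
    W_k[selected] = W[selected]
    return selected, W_k
```

Under these assumptions, each row of the weight matrix corresponds to one feature, so a matrix returned by `top_k_rows(W, k)` has at most `k` non-zero rows and its `l20_norm` is therefore at most `k`, which is the sense in which the constraint selects exactly the top-k features.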
About the journal:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.