Jingyu Wang , Mingqing Liu , Feiping Nie , Xuelong Li
Pattern Recognition, vol. 169, Article 111881. Published 2025-06-04. DOI: 10.1016/j.patcog.2025.111881. Impact factor: 7.6; JCR Q1 (Computer Science, Artificial Intelligence). https://www.sciencedirect.com/science/article/pii/S0031320325005412
Normalized cut co-clustering with out-of-sample extension
Co-clustering is a key data mining technique in many real-world applications, where anchor-based methods reveal the dual relationships between samples and anchors. Because relaxation and post-processing lose information, classical anchor-based methods can suffer from performance degradation. To overcome this disadvantage, we propose a Normalized Cut Co-Clustering (NC³) model, which assigns clusters to samples and anchors by alternately updating their discrete label matrices. Unlike traditional anchor-based co-clustering methods, our model directly solves the original discrete normalized cut problem on the bipartite graph. To address the discrete cut problem, we present an iterative coordinate ascent algorithm, which can speed up the clustering process. By optimizing the label matrices of samples and anchors, the clusters are obtained without a relaxation–discretization step. Furthermore, the proposed NC³ model can handle out-of-sample clustering based on the labels of the anchors. Extensive experiments validate the effectiveness of our model, which achieves competitive results compared with state-of-the-art approaches.
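The abstract does not give the model's formulas, so the following is only a minimal sketch of the two ideas it names, under stated assumptions: the standard normalized cut objective evaluated on a sample-anchor bipartite graph with discrete labels, and a simple anchor-label vote for unseen samples. The function names, the similarity matrix `B`, and the voting rule are illustrative assumptions, not the authors' NC³ algorithm.

```python
import numpy as np


def bipartite_ncut_score(B, sample_labels, anchor_labels, n_clusters):
    """Sum over clusters of assoc(C_k, C_k) / vol(C_k) on a bipartite graph.

    B is an (n_samples x n_anchors) non-negative similarity matrix.
    The standard normalized cut equals n_clusters minus this score when every
    cluster has positive volume, so a larger score means a better partition.
    """
    B = np.asarray(B, dtype=float)
    # One-hot (discrete) label matrices for samples and anchors.
    F = np.eye(n_clusters)[np.asarray(sample_labels)]
    G = np.eye(n_clusters)[np.asarray(anchor_labels)]
    # Edge weight inside each cluster; counted twice because the full
    # adjacency [[0, B], [B^T, 0]] is symmetric.
    within = 2.0 * (F * (B @ G)).sum(axis=0)
    # Cluster volume: degrees of its samples plus degrees of its anchors.
    vol = F.T @ B.sum(axis=1) + G.T @ B.sum(axis=0)
    valid = vol > 0  # skip empty clusters to avoid division by zero
    return float(np.sum(within[valid] / vol[valid]))


def assign_out_of_sample(b_new, anchor_labels, n_clusters):
    """Assumed rule: pick the cluster whose anchors are most similar in total.

    b_new holds similarities between new samples and the existing anchors,
    with shape (n_anchors,) or (n_new x n_anchors).
    """
    b_new = np.atleast_2d(np.asarray(b_new, dtype=float))
    G = np.eye(n_clusters)[np.asarray(anchor_labels)]
    return np.argmax(b_new @ G, axis=1)


# Toy graph: samples 0 and 1 connect to anchor 0, sample 2 to anchor 1.
B = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
good = bipartite_ncut_score(B, [0, 0, 1], [0, 1], n_clusters=2)
bad = bipartite_ncut_score(B, [0, 1, 1], [0, 1], n_clusters=2)
new_labels = assign_out_of_sample([[0.9, 0.1], [0.2, 0.8]], [0, 1], n_clusters=2)
```

A coordinate ascent scheme in this spirit would change one label at a time and keep the change only if the score increases; the sketch above only evaluates the objective and does not implement that search.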
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.