Tensor-based incomplete multiple kernel clustering with auto-weighted late fusion alignment
Authors: Xiaoxing Guo, Gui-Fu Lu
Journal: Pattern Recognition, vol. 164, Article 111601
Publication date: 2025-03-12
DOI: 10.1016/j.patcog.2025.111601
URL: https://www.sciencedirect.com/science/article/pii/S0031320325002614
Impact factor: 7.5; JCR: Q1 (Computer Science, Artificial Intelligence)
Citations: 0
Abstract
In the era of big data, the rapid increase in data volume is accompanied by substantial missing data issues. Incomplete multiple kernel clustering (IMKC) investigates how to perform clustering when certain rows or columns of the predefined kernel matrix are missing. Among existing IMKC methods, the recently proposed late fusion IMKC (LF-IMKC) algorithm has garnered considerable attention due to its superior clustering accuracy and computational efficiency. However, existing LF-IMKC algorithms still suffer from several limitations. Firstly, we observe that in existing methods, the missing kernel imputation, kernel partition learning and subsequent late fusion processes are treated separately, which may lead to suboptimal solutions and adversely affect the clustering performance. Secondly, existing LF-IMKC algorithms treat each base partition equally, overlooking the differences in their contributions to the consistent clustering process. Thirdly, existing algorithms typically overlook the higher-order correlations between the base partitions as well as the strong correlations between the base and consensus partitions, let alone leverage these correlations for clustering. To address these issues, we propose a novel method, i.e., tensor-based incomplete multiple kernel clustering with auto-weighted late fusion alignment (TIKC-ALFA). Specifically, we first integrate the missing kernel imputation, base partition learning and subsequent late fusion processes within a unified framework. Secondly, we construct a third-order tensor using the weighted base partitions, offering an innovative perspective on tensor slices through the lens of weight distribution, and then utilize the tensor nuclear norm (TNN) to approximate the true rank of the tensor. Furthermore, we incorporate the consensus partition into the tensor structure originally constructed solely from weighted base partitions to further investigate the strong correlations between the base partitions and the consensus partition. The experimental results on six commonly used datasets demonstrate the effectiveness of our algorithm.
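To make the tensor construction described in the abstract more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the common t-SVD-based definition of the tensor nuclear norm (FFT along the third mode, then the averaged sum of singular values of the frontal slices), assumes that each weighted base partition forms one frontal slice, and appends the consensus partition as an extra slice. The abstract does not specify the exact TNN variant, the tensor orientation, or how the weights are learned, so the function names `tensor_nuclear_norm` and `stack_weighted_partitions`, the uniform weights, and the random partitions are all hypothetical.

```python
import numpy as np


def tensor_nuclear_norm(T):
    """t-SVD-based TNN (assumed definition): FFT along the third mode,
    then average the matrix nuclear norms of the frontal slices."""
    n3 = T.shape[2]
    T_hat = np.fft.fft(T, axis=2)  # frontal slices in the Fourier domain
    total = 0.0
    for k in range(n3):
        # Singular values of each (possibly complex) frontal slice.
        total += np.linalg.svd(T_hat[:, :, k], compute_uv=False).sum()
    return total / n3


def stack_weighted_partitions(base_partitions, weights, consensus=None):
    """Stack weighted base partitions (each n x k) as frontal slices of an
    n x k x m tensor; optionally append the consensus partition as one
    more slice. Orientation is an assumption made for illustration."""
    slices = [w * H for w, H in zip(weights, base_partitions)]
    if consensus is not None:
        slices.append(consensus)
    return np.stack(slices, axis=2)


rng = np.random.default_rng(0)
n, k, m = 50, 3, 4  # samples, clusters, base kernels (illustrative sizes)


def random_partition():
    """Hypothetical stand-in for a learned partition: an n x k matrix
    with orthonormal columns."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, k)))
    return Q


base = [random_partition() for _ in range(m)]
weights = np.full(m, 1.0 / m)  # uniform placeholder; the paper learns these
consensus = random_partition()

T = stack_weighted_partitions(base, weights, consensus)
print(T.shape, tensor_nuclear_norm(T))
```

Under this definition, a tensor whose frontal slices are all identical has a TNN equal to the nuclear norm of a single slice, which gives some intuition for why minimizing the TNN would encourage agreement among the base partitions and the consensus partition.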
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.