Zhibin Gu, Songhe Feng, Zhendong Li, Jiazheng Yuan, Jun Liu
{"title":"NOODLE: Joint Cross-View Discrepancy Discovery and High-Order Correlation Detection for Multi-View Subspace Clustering","authors":"Zhibin Gu, Songhe Feng, Zhendong Li, Jiazheng Yuan, Jun Liu","doi":"10.1145/3653305","DOIUrl":null,"url":null,"abstract":"<p>Benefiting from the effective exploration of the valuable topological pair-wise relationship of data points across multiple views, multi-view subspace clustering (MVSC) has received increasing attention in recent years. However, we observe that existing MVSC approaches still suffer from two limitations that need to be further improved to enhance the clustering effectiveness. Firstly, previous MVSC approaches mainly prioritize extracting multi-view consistency, often neglecting the cross-view discrepancy that may arise from noise, outliers, and view-inherent properties. Secondly, existing techniques are constrained by their reliance on pair-wise sample correlation and pair-wise view correlation, failing to capture the high-order correlations that are enclosed within multiple views. To address these issues, we propose a novel MVSC framework via joi<b>N</b>t cr<b>O</b>ss-view discrepancy disc<b>O</b>very an<b>D</b> high-order corre<b>L</b>ation d<b>E</b>tection (<b>NOODLE</b>), seeking an informative target subspace representation compatible across multiple features to facilitate the downstream clustering task. Specifically, we first exploit the self-representation mechanism to learn multiple view-specific affinity matrices, which are further decomposed into cohesive factors and incongruous factors to fit the multi-view consistency and discrepancy, respectively. Additionally, an explicit cross-view sparse regularization is applied to incoherent parts, ensuring the consistency and discrepancy to be precisely separated from the initial subspace representations. Meanwhile, the multiple cohesive parts are stacked into a three-dimensional tensor associated with a tensor-Singular Value Decomposition (t-SVD) based weighted tensor nuclear norm constraint, enabling effective detection of the high-order correlations implicit in multi-view data. Our proposed method outperforms state-of-the-art methods for multi-view clustering on six benchmark datasets, demonstrating its effectiveness.</p>","PeriodicalId":49249,"journal":{"name":"ACM Transactions on Knowledge Discovery from Data","volume":"23 1","pages":""},"PeriodicalIF":4.0000,"publicationDate":"2024-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Knowledge Discovery from Data","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3653305","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0
Abstract
Benefiting from the effective exploration of the valuable topological pair-wise relationship of data points across multiple views, multi-view subspace clustering (MVSC) has received increasing attention in recent years. However, we observe that existing MVSC approaches still suffer from two limitations that need to be further improved to enhance the clustering effectiveness. Firstly, previous MVSC approaches mainly prioritize extracting multi-view consistency, often neglecting the cross-view discrepancy that may arise from noise, outliers, and view-inherent properties. Secondly, existing techniques are constrained by their reliance on pair-wise sample correlation and pair-wise view correlation, failing to capture the high-order correlations that are enclosed within multiple views. To address these issues, we propose a novel MVSC framework via joiNt crOss-view discrepancy discOvery anD high-order correLation dEtection (NOODLE), seeking an informative target subspace representation compatible across multiple features to facilitate the downstream clustering task. Specifically, we first exploit the self-representation mechanism to learn multiple view-specific affinity matrices, which are further decomposed into cohesive factors and incongruous factors to fit the multi-view consistency and discrepancy, respectively. Additionally, an explicit cross-view sparse regularization is applied to incoherent parts, ensuring the consistency and discrepancy to be precisely separated from the initial subspace representations. Meanwhile, the multiple cohesive parts are stacked into a three-dimensional tensor associated with a tensor-Singular Value Decomposition (t-SVD) based weighted tensor nuclear norm constraint, enabling effective detection of the high-order correlations implicit in multi-view data. Our proposed method outperforms state-of-the-art methods for multi-view clustering on six benchmark datasets, demonstrating its effectiveness.
期刊介绍:
TKDD welcomes papers on a full range of research in the knowledge discovery and analysis of diverse forms of data. Such subjects include, but are not limited to: scalable and effective algorithms for data mining and big data analysis, mining brain networks, mining data streams, mining multi-media data, mining high-dimensional data, mining text, Web, and semi-structured data, mining spatial and temporal data, data mining for community generation, social network analysis, and graph structured data, security and privacy issues in data mining, visual, interactive and online data mining, pre-processing and post-processing for data mining, robust and scalable statistical methods, data mining languages, foundations of data mining, KDD framework and process, and novel applications and infrastructures exploiting data mining technology including massively parallel processing and cloud computing platforms. TKDD encourages papers that explore the above subjects in the context of large distributed networks of computers, parallel or multiprocessing computers, or new data devices. TKDD also encourages papers that describe emerging data mining applications that cannot be satisfied by the current data mining technology.