Scalable Multi-View Graph Clustering With Cross-View Corresponding Anchor Alignment
Siwei Wang; Xinwang Liu; Qing Liao; Yi Wen; En Zhu; Kunlun He
IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 5, pp. 2932-2945
Publication date: 2025-02-05
DOI: 10.1109/TKDE.2025.3538852
URL: https://ieeexplore.ieee.org/document/10874196/
Impact factor: 8.9; JCR: Q1 (Computer Science, Artificial Intelligence)
Citation count: 0
Abstract
Multi-view graph clustering (MVGC) explores pairwise correlations among all instances and aggregates information from diverse sources into an optimal graph structure. A major obstacle to practical MVGC is its high time and space complexity, which prevents its use in large-scale applications. As a promising solution for large-scale problems, the anchor-based strategy selects a small set of key landmarks to stand in for the entire dataset. Despite its efficiency, anchors chosen across views may be semantically unaligned, in contrast to the naturally aligned full-sample setting, and this can lead to inappropriate graph fusion in the subsequent step. The existing literature has paid limited attention to this Multi-View Anchor-Unaligned Problem (MV-AUP). In this paper, we first revisit existing multi-view anchor graph clustering frameworks and present the MV-AUP phenomenon. We then propose a novel Multi-view Corresponding Anchor Graph Alignment Fusion framework (MV-CAGAF), which solves MV-AUP by matching structural representations in multi-dimensional spaces. Further, we prove theoretically that the proposed structural matching approach can be regarded as minimizing the Earth Mover's Distance (EMD) between the two relative anchor distributions. Building on this, we design a multi-view anchor graph fusion paradigm with correspondence alignment, which retains linear complexity in the number of samples for scalable cross-view clustering. With the help of this fusion framework, MV-CAGAF achieves significant improvements on comprehensive benchmark datasets. Most importantly, experimental results on both simulated and real-world datasets demonstrate the importance of cross-view alignment for large-scale multi-view clustering.
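To make the anchor-unaligned problem concrete, the following is a minimal sketch and not the MV-CAGAF algorithm itself, whose details the abstract does not give. It assumes each anchor can be described by its column in a sample-to-anchor graph (samples are naturally aligned across views, so these columns are comparable), and it swaps in the Hungarian algorithm from SciPy for the matching step. The function names `anchor_graph`, `align_anchors`, and `demo`, the softmax weighting, and the synthetic data are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def anchor_graph(X, anchors, temperature=1.0):
    """Row-stochastic n x m graph: softmax of negative squared distances (assumed form)."""
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    logits = -d2 / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    Z = np.exp(logits)
    return Z / Z.sum(axis=1, keepdims=True)


def align_anchors(Z_ref, Z_other):
    """Match anchors across views by comparing their columns over the shared samples.

    Returns `match` such that column match[i] of Z_other corresponds to
    column i of Z_ref. The cost matrix is m x m, so the cost of building it
    grows linearly with the number of samples n.
    """
    cost = ((Z_ref.T[:, None, :] - Z_other.T[None, :, :]) ** 2).sum(axis=2)
    _, match = linear_sum_assignment(cost)  # minimum-cost one-to-one matching
    return match


def demo(seed=0):
    rng = np.random.default_rng(seed)
    m, per_cluster = 4, 50
    labels = np.repeat(np.arange(m), per_cluster)
    n = labels.size
    # Two synthetic views with different feature dimensions (4 and 6).
    X1 = 10.0 * np.eye(m)[labels] + rng.normal(scale=0.5, size=(n, 4))
    X2 = 10.0 * np.eye(m, 6)[labels] + rng.normal(scale=0.5, size=(n, 6))
    # Anchors are cluster means; view 2 stores them in a shuffled order,
    # which simulates semantically unaligned anchors.
    perm = np.array([2, 0, 3, 1])
    anchors1 = np.stack([X1[labels == c].mean(axis=0) for c in range(m)])
    anchors2 = np.stack([X2[labels == c].mean(axis=0) for c in perm])
    Z1 = anchor_graph(X1, anchors1)
    Z2 = anchor_graph(X2, anchors2)
    match = align_anchors(Z1, Z2)
    fused = 0.5 * (Z1 + Z2[:, match])  # fuse only after reordering view 2
    return labels, perm, match, fused


if __name__ == "__main__":
    labels, perm, match, fused = demo()
    print("recovered matching:", match)
    print("inverse of the shuffle:", np.argsort(perm))
```

When both views have the same number of anchors with uniform weights, a minimum-cost one-to-one matching of this kind is also an optimal transport plan between the two anchor sets, which is one way to read the EMD connection mentioned in the abstract. Averaging `Z1` and `Z2` without the reordering would mix columns that refer to different clusters.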
About the journal:
The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools, and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.