Scalable Multi-View Graph Clustering With Cross-View Corresponding Anchor Alignment

IF 8.9 · Zone 2 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Transactions on Knowledge and Data Engineering · Pub Date: 2025-02-05 · DOI: 10.1109/TKDE.2025.3538852

Siwei Wang;Xinwang Liu;Qing Liao;Yi Wen;En Zhu;Kunlun He

Vol. 37, No. 5, pp. 2932-2945 · Journal Article · https://ieeexplore.ieee.org/document/10874196/

Citations: 0

Abstract

Scalable Multi-View Graph Clustering With Cross-View Corresponding Anchor Alignment

Multi-view graph clustering (MVGC) explores pairwise correlations among all instances and aggregates information from diverse sources into an optimal graph structure. A major obstacle to practical MVGC is its high time and space complexity, which prevents its use in large-scale applications. As a promising remedy, the anchor-based strategy selects a small set of key landmarks to stand in for the entire dataset. Despite its efficiency, anchors chosen across views may be semantically unaligned, in contrast to the naturally aligned full-sample setting, and this can lead to inappropriate graph fusion in the subsequent step. This Multi-View Anchor-Unaligned Problem (MV-AUP) has received limited attention in the existing literature. In this paper, we first revisit existing multi-view anchor graph clustering frameworks and present the MV-AUP phenomenon. We then propose a novel Multi-view Corresponding Anchor Graph Alignment Fusion framework (MV-CAGAF), which solves MV-AUP through structural representation matching in multi-dimensional spaces. Further, we prove theoretically that the proposed structural matching approach can be regarded as minimizing the Earth Mover's Distance (EMD) between the two relative anchor distributions. Based on this, we design a multi-view anchor graph fusion paradigm with correspondence alignment, which retains complexity linear in the number of samples for scalable cross-view clustering. MV-CAGAF achieves significant improvements on comprehensive benchmark datasets with the help of the new fusion framework. Most importantly, experimental results on both simulated and real-world datasets demonstrate the importance of cross-view alignment for large-scale multi-view clustering.
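To make the anchor-unaligned problem concrete, the following toy sketch may help. It is not the authors' MV-CAGAF algorithm: the data, the Gaussian affinity, and the helper names (`anchor_graph`, `match_anchors`, `fuse`) are all assumptions made for illustration, and a brute-force search over permutations stands in for whatever matching procedure the paper actually uses. Each anchor is represented by its column of sample affinities, which is comparable across views because the samples themselves are aligned. With equal mass on every anchor, a one-to-one assignment of this kind is a special case of an EMD problem.

```python
import math
from itertools import permutations

def anchor_graph(samples, anchors):
    """Return an n x m matrix of row-normalised Gaussian sample-to-anchor affinities."""
    graph = []
    for x in samples:
        weights = [math.exp(-sum((a - b) ** 2 for a, b in zip(x, anchor)))
                   for anchor in anchors]
        total = sum(weights)
        graph.append([w / total for w in weights])
    return graph

def match_anchors(z_ref, z_other):
    """Brute-force the permutation p minimising sum_i ||z_ref[:, i] - z_other[:, p[i]]||^2.

    Illustrative only: this costs O(m!) and is usable for a handful of anchors.
    """
    m = len(z_ref[0])
    cols_ref = list(zip(*z_ref))      # one n-dimensional vector per anchor
    cols_other = list(zip(*z_other))
    cost = [[sum((a - b) ** 2 for a, b in zip(cols_ref[i], cols_other[j]))
             for j in range(m)] for i in range(m)]
    return min(permutations(range(m)),
               key=lambda p: sum(cost[i][p[i]] for i in range(m)))

def fuse(z_ref, z_other, perm):
    """Average two anchor graphs after reordering the second one's columns by perm."""
    return [[(r[i] + o[perm[i]]) / 2 for i in range(len(r))]
            for r, o in zip(z_ref, z_other)]

# Six samples in three clusters, observed in two views (hypothetical toy data).
view1 = [(0.0, 0.1), (0.1, -0.1), (3.0, 0.1), (2.9, -0.1), (0.1, 3.0), (-0.1, 2.9)]
view2 = [(1.0, 1.1), (0.9, 1.0), (-2.0, 1.1), (-2.1, 0.9), (1.1, -2.0), (1.0, -2.1)]

# Anchors sit near the cluster centres, but view 2 lists them in a different order.
anchors1 = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]    # clusters 0, 1, 2
anchors2 = [(1.0, -2.0), (1.0, 1.0), (-2.0, 1.0)]  # clusters 2, 0, 1

z1 = anchor_graph(view1, anchors1)
z2 = anchor_graph(view2, anchors2)

naive = fuse(z1, z2, (0, 1, 2))   # columns averaged as given, ignoring the mismatch
perm = match_anchors(z1, z2)      # recovered column correspondence
aligned = fuse(z1, z2, perm)

print("recovered matching:", perm)
print("naive fused row 0:  ", [round(v, 3) for v in naive[0]])
print("aligned fused row 0:", [round(v, 3) for v in aligned[0]])
```

In this toy setting the naive fusion splits each sample's affinity between two unrelated anchors, while the aligned fusion keeps it concentrated on a single one. A practical implementation would replace the factorial-time search with a polynomial-time assignment or optimal-transport solver.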


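Source journal

IEEE Transactions on Knowledge and Data Engineering (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 11.70
Self-citation rate: 3.40%
Articles published: 515
Review time: 6 months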

About the journal: The IEEE Transactions on Knowledge and Data Engineering covers knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.