A versatile framework for attributed network clustering via K-nearest neighbor augmentation

Yiran Li, Gongyao Guo, Jieming Shi, Renchi Yang, Shiqi Shen, Qing Li, Jun Luo
The VLDB Journal. Published 2024-09-16. DOI: 10.1007/s00778-024-00875-8 (https://doi.org/10.1007/s00778-024-00875-8)

Abstract

Attributed networks, which carry entity-specific information in node attributes, are ubiquitous in modeling social networks, e-commerce, bioinformatics, and other domains. Their inherent network topology ranges from simple graphs to hypergraphs with high-order interactions and multiplex graphs with separate layers. An important graph mining task is node clustering, which aims to partition the nodes of an attributed network into k disjoint clusters such that intra-cluster nodes are closely connected and share similar attributes, while inter-cluster nodes are far apart and dissimilar. It is highly challenging to capture multi-hop connections via nodes or attributes for effective clustering on multiple types of attributed networks. In this paper, we first present AHCKA as an efficient approach to attributed hypergraph clustering (AHC). AHCKA includes a carefully crafted K-nearest neighbor augmentation strategy that exploits attribute information on hypergraphs, a joint hypergraph random walk model used to devise an effective AHC objective, and an efficient solver with speedup techniques for optimizing that objective. The proposed techniques are extensible to various types of attributed networks, and thus we develop ANCKA as a versatile attributed network clustering framework capable of attributed graph clustering, attributed multiplex graph clustering, and AHC. Moreover, we devise ANCKA-GPU with algorithmic designs tailored for GPU acceleration to boost efficiency. We have conducted extensive experiments to compare our methods with 19 competitors on 8 attributed hypergraphs, 16 competitors on 6 attributed graphs, and 16 competitors on 3 attributed multiplex graphs, all demonstrating the superb clustering quality and efficiency of our methods.
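
To make the idea of K-nearest neighbor augmentation concrete, the following is a minimal sketch for a simple (pairwise) graph. It is not the AHCKA or ANCKA algorithm: the abstract does not specify the similarity measure, the edge weighting, or how the augmented structure enters the random walk model, so the cosine similarity, the unweighted union of edges, and the function names `cosine`, `knn_edges`, and `augment` are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def knn_edges(attrs, K):
    """Return undirected edges linking each node to its K most attribute-similar nodes."""
    n = len(attrs)
    edges = set()
    for i in range(n):
        # Rank all other nodes by similarity to node i, highest first.
        ranked = sorted((j for j in range(n) if j != i),
                        key=lambda j: cosine(attrs[i], attrs[j]),
                        reverse=True)
        for j in ranked[:K]:
            edges.add((min(i, j), max(i, j)))
    return edges

def augment(topology_edges, attrs, K):
    """Union of the original pairwise edges and the attribute KNN edges (assumed scheme)."""
    normalized = {(min(u, v), max(u, v)) for u, v in topology_edges}
    return normalized | knn_edges(attrs, K)

# Toy example: nodes 0 and 1 have similar attributes, as do nodes 2 and 3.
attrs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
topology = [(1, 2)]  # the only edge present in the original topology
augmented = augment(topology, attrs, K=1)
print(sorted(augmented))  # [(0, 1), (1, 2), (2, 3)]
```

In this toy case the original topology connects only nodes 1 and 2, while the attribute-based edges link the pairs that a clustering with k = 2 would plausibly group together. The brute-force ranking costs quadratic time in the number of nodes, so a practical system would likely rely on an approximate nearest neighbor index instead.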
