ClusterLP: A novel Cluster-aware Link Prediction model in undirected and directed graphs

IF 3.2; Zone 3 (Computer Science); JCR Q2: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

International Journal of Approximate Reasoning; Pub Date: 2024-05-24; DOI: 10.1016/j.ijar.2024.109216

Shanfan Zhang, Wenjiao Zhang, Zhan Bu, Xia Zhang

Volume 172, Article 109216. Full text: https://www.sciencedirect.com/science/article/pii/S0888613X24001038

Citations: 0

Abstract

Link prediction models endeavor to understand the distribution of links within graphs and forecast the presence of potential links. With the advancements in deep learning, prevailing methods typically strive to acquire low-dimensional representations of nodes in networks, aiming to capture and retain the structure and inherent characteristics of networks. However, the majority of these methods primarily focus on preserving the microscopic structure, such as the first- and second-order proximities of nodes, while largely disregarding the mesoscopic cluster structure, which stands out as one of the network's most prominent features. Following the homophily principle, nodes within the same cluster exhibit greater similarity to each other compared to those from different clusters, suggesting that they should possess analogous vertex representations and higher probabilities of linkage. In this study, we develop a straightforward yet efficient Cluster-aware Link Prediction framework (ClusterLP), with the objective of directly leveraging cluster structures to predict links among nodes with maximum accuracy in both undirected and directed graphs. Specifically, we posit that establishing links between nodes with similar representation vectors and cluster tendencies is more feasible in undirected graphs, whereas nodes in directed graphs are inclined to point towards nodes with akin representation vectors and greater influence. We tailor the implementation of ClusterLP for undirected and directed graphs, respectively, and experimental findings using multiple real-world networks demonstrate the high competitiveness of our models in the realm of link prediction tasks. The code utilized in our implementation is accessible at https://github.com/ZINUX1998/ClusterLP.
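
The intuition stated in the abstract can be illustrated with a toy scoring sketch. This is not the paper's model: the function names, the Gaussian similarity kernel, the explicit cluster centers, and the per-node influence vector are all assumptions made here for illustration only. The undirected score is symmetric and rewards pairs that are close in embedding space and share cluster tendencies; the directed score is asymmetric and boosts links that point toward more influential nodes.

```python
import numpy as np


def pairwise_sq_dist(a, b):
    """Squared Euclidean distance between every row of a and every row of b."""
    return ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)


def soft_cluster_membership(z, centers):
    """Softmax over negative squared distances to (hypothetical) cluster centers."""
    logits = -pairwise_sq_dist(z, centers)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)


def undirected_score(z, centers):
    """Symmetric score: high when two nodes are close AND share cluster tendencies."""
    similarity = np.exp(-pairwise_sq_dist(z, z))
    membership = soft_cluster_membership(z, centers)
    return similarity * (membership @ membership.T)


def directed_score(z, influence):
    """Asymmetric score: entry [i, j] rates the link i -> j, scaled by j's influence."""
    similarity = np.exp(-pairwise_sq_dist(z, z))
    return similarity * influence[None, :]
```

With embeddings `z` of shape `(n, d)`, `undirected_score(z, centers)` returns an `(n, n)` matrix whose entries can be ranked to propose candidate links; in `directed_score`, raising a node's assumed influence raises the scores of links pointing at it without changing the scores of links leaving it.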

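Source journal

International Journal of Approximate Reasoning (Engineering & Technology: Computer Science, Artificial Intelligence)

CiteScore: 6.90
Self-citation rate: 12.80%
Articles published: 170
Review time: 67 days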

About the journal: The International Journal of Approximate Reasoning is intended to serve as a forum for the treatment of imprecision and uncertainty in Artificial and Computational Intelligence, covering both the foundations of uncertainty theories and the design of intelligent systems for scientific and engineering applications. It publishes high-quality research papers describing theoretical developments or innovative applications, as well as review articles on topics of general interest. Relevant topics include, but are not limited to, probabilistic reasoning and Bayesian networks, imprecise probabilities, random sets, belief functions (Dempster-Shafer theory), possibility theory, fuzzy sets, rough sets, decision theory, non-additive measures and integrals, qualitative reasoning about uncertainty, comparative probability orderings, game-theoretic probability, default reasoning, nonstandard logics, argumentation systems, inconsistency-tolerant reasoning, elicitation techniques, and philosophical foundations and psychological models of uncertain reasoning. Domains of application for uncertain reasoning systems include risk analysis and assessment, information retrieval and database design, information fusion, machine learning, data and web mining, computer vision, image and signal processing, intelligent data analysis, statistics, and multi-agent systems.