{"title":"A Fast Parallel Approach for Neighborhood-Based Link Prediction by Disregarding Large Hubs","authors":"Subhajit Sahu, Kishore Kothapalli","doi":"10.1002/cpe.8331","DOIUrl":null,"url":null,"abstract":"<div>\n \n <p>Link prediction can help rectify inaccuracies in various graph algorithms, stemming from unaccounted-for or overlooked links within networks. However, many existing works use a baseline approach, which incurs unnecessary computational costs due to its high time complexity. Further, many studies focus on smaller graphs, which can lead to misleading conclusions. Here, we study the prediction of links using neighborhood-based similarity measures on large graphs. In particular, we improve upon the baseline approach (IBase), and propose a heuristic approach that additionally disregards large hubs (DLH), based on the idea that high-degree nodes contribute little similarity among their neighbors. On a server equipped with dual 16-core Intel Xeon Gold 6226R processors, DLH is on average <span></span><math>\n <semantics>\n <mrow>\n <mn>1019</mn>\n <mo>×</mo>\n </mrow>\n <annotation>$$ 1019\\times $$</annotation>\n </semantics></math> faster than IBase, especially on web graphs and social networks, while maintaining similar prediction accuracy. Notably, DLH achieves a link prediction rate of 38.1<i>M</i> edges/s and improves performance by <span></span><math>\n <semantics>\n <mrow>\n <mn>1.6</mn>\n <mo>×</mo>\n </mrow>\n <annotation>$$ 1.6\\times $$</annotation>\n </semantics></math> for every doubling of threads.</p>\n </div>","PeriodicalId":55214,"journal":{"name":"Concurrency and Computation-Practice & Experience","volume":"37 2","pages":""},"PeriodicalIF":1.5000,"publicationDate":"2024-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Concurrency and Computation-Practice & Experience","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/cpe.8331","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
引用次数: 0
Abstract
Link prediction can help rectify inaccuracies in various graph algorithms, stemming from unaccounted-for or overlooked links within networks. However, many existing works use a baseline approach, which incurs unnecessary computational costs due to its high time complexity. Further, many studies focus on smaller graphs, which can lead to misleading conclusions. Here, we study the prediction of links using neighborhood-based similarity measures on large graphs. In particular, we improve upon the baseline approach (IBase), and propose a heuristic approach that additionally disregards large hubs (DLH), based on the idea that high-degree nodes contribute little similarity among their neighbors. On a server equipped with dual 16-core Intel Xeon Gold 6226R processors, DLH is on average faster than IBase, especially on web graphs and social networks, while maintaining similar prediction accuracy. Notably, DLH achieves a link prediction rate of 38.1M edges/s and improves performance by for every doubling of threads.
期刊介绍:
Concurrency and Computation: Practice and Experience (CCPE) publishes high-quality, original research papers, and authoritative research review papers, in the overlapping fields of:
Parallel and distributed computing;
High-performance computing;
Computational and data science;
Artificial intelligence and machine learning;
Big data applications, algorithms, and systems;
Network science;
Ontologies and semantics;
Security and privacy;
Cloud/edge/fog computing;
Green computing; and
Quantum computing.