Properties and Performance of the ABCDe Random Graph Model with Community Structure
Bogumił Kamiński, Tomasz Olczak, Bartosz Pankratz, Paweł Prałat, François Théberge
Big Data Research, volume 30, Article 100348. Published 2022-11-28. DOI: 10.1016/j.bdr.2022.100348
https://www.sciencedirect.com/science/article/pii/S2214579622000429
Abstract
In this paper, we investigate the properties and performance of synthetic random graph models with a built-in community structure. Such models are important for evaluating and tuning community detection algorithms, which are unsupervised by nature. We propose ABCDe, a multi-threaded implementation of the ABCD (Artificial Benchmark for Community Detection) graph generator. We discuss the implementation details of the algorithm and compare it with both the previously available sequential version of the ABCD model and the parallel implementation of the standard and extensively used LFR (Lancichinetti–Fortunato–Radicchi) generator. We show that ABCDe is more than ten times faster and scales better than the parallel implementation of LFR provided in NetworKit. Moreover, ABCDe is not only faster: random graphs generated by ABCD have properties similar to those generated by the original LFR algorithm, whereas the parallelized NetworKit implementation of LFR produces graphs with noticeably different characteristics.
About the journal:
The journal aims to promote and communicate advances in big data research by providing a fast, high-quality forum for researchers, practitioners and policy makers from the many different communities working on, and with, this topic.
The journal will accept papers on foundational aspects of dealing with big data, as well as papers on specific platforms and technologies used to deal with big data. To promote data science and interdisciplinary collaboration between fields, and to showcase the benefits of data-driven research, papers demonstrating applications of big data in domains as diverse as geoscience, social web, finance, e-commerce, health care, environment and climate, physics and astronomy, chemistry, life sciences and drug discovery, digital libraries and scientific publications, security and government will also be considered. Occasionally the journal may publish whitepapers on policies, standards and best practices.