Parallel Heuristics for Scalable Community Detection

2014 IEEE International Parallel & Distributed Processing Symposium Workshops. Pub Date: 2014-05-19. DOI: 10.1109/IPDPSW.2014.155

Hao Lu, M. Halappanavar, A. Kalyanaraman, Sutanay Choudhury


Citations: 160

Abstract

Community detection has become a fundamental operation in numerous graph-theoretic applications. It is used to reveal natural divisions that exist within real-world networks without imposing prior size or cardinality constraints on the set of communities. Despite its potential for application, there is only limited support for community detection on large-scale parallel computers, largely owing to the irregular and inherently sequential nature of the underlying heuristics. In this paper, we present parallelization heuristics for fast community detection using the Louvain method as the serial template. The Louvain method is an iterative heuristic for modularity optimization. Originally developed by Blondel et al. in 2008, the method has become increasingly popular owing to its ability to detect high-modularity community partitions in a fast and memory-efficient manner. However, the method is also inherently sequential, which limits its scalability. Here, we observe certain key properties of this method that present challenges for its parallelization, and consequently propose heuristics that are designed to break the sequential barrier. For evaluation purposes, we implemented our heuristics using OpenMP multithreading and tested them on real-world graphs derived from multiple application domains (e.g., internet, citation, biological). Compared to the serial Louvain implementation, our parallel implementation produces community outputs with higher modularity for most of the inputs tested, in a comparable number of iterations, while providing real speedups of up to 8× using 32 threads. In addition, our parallel implementation exhibits weak scaling properties on up to 32 threads.
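To make "iterative heuristic for modularity optimization" concrete, the following is a minimal serial sketch in Python. It is not the authors' OpenMP implementation and does not include their parallel heuristics. It assumes an undirected, unweighted graph without self-loops, given as an edge list. The function names `modularity` and `local_move_sweep` are hypothetical, and the sweep recomputes modularity from scratch for every candidate move, which is far slower than the incremental gain computation a practical Louvain code would use. The graph-coarsening phase of the Louvain method is omitted.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity Q = sum over communities c of (L_c / m) - (d_c / 2m)^2,
    where m is the edge count, L_c the number of edges inside c, and d_c
    the total degree of the vertices in c. Assumes an undirected,
    unweighted edge list without self-loops (an assumption of this sketch)."""
    m = len(edges)
    intra = defaultdict(int)   # edges with both endpoints in the community
    degree = defaultdict(int)  # sum of vertex degrees per community
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def local_move_sweep(edges, community):
    """One serial pass over the vertices. Each vertex is moved to the
    neighbouring community that gives the largest strict modularity
    increase, if any. Mutates `community`; returns True if a vertex moved."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    moved = False
    for v in sorted(neighbours):
        original = community[v]
        best_c, best_q = original, modularity(edges, community)
        for c in sorted({community[n] for n in neighbours[v]}):
            community[v] = c  # tentatively place v in community c
            q = modularity(edges, community)
            if q > best_q + 1e-12:  # accept only strict improvements
                best_c, best_q = c, q
        community[v] = best_c
        moved = moved or best_c != original
    return moved
```

Starting from one community per vertex and calling `local_move_sweep` until it returns `False` gives a local optimum of modularity for this phase. Each vertex's decision reads the community assignments produced by the vertices processed before it, which is the sequential dependency the abstract refers to.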

