Parallel Clique Counting and Peeling Algorithms

Proceedings of the 2021 SIAM Conference on Applied and Computational Discrete Algorithms (2021, Online), pp. 135-146. Pub Date: 2020-02-24. DOI: 10.1137/1.9781611976830.13

Jessica Shi, Laxman Dhulipala, Julian Shun


Citation count: 30

Abstract

Dense subgraphs capture strong communities in social networks and entities possessing strong interactions in biological networks. In particular, $k$-clique counting and listing have applications in identifying important actors in a graph. However, finding $k$-cliques is computationally expensive, and thus it is important to have fast parallel algorithms.

We present a new parallel algorithm for $k$-clique counting that has polylogarithmic span and is work-efficient with respect to the well-known sequential algorithm for $k$-clique listing by Chiba and Nishizeki. Our algorithm can be extended to support listing and enumeration, and is based on computing low out-degree orientations. We present a new linear-work and polylogarithmic span algorithm for computing such orientations, and new parallel algorithms for producing unbiased estimations of clique counts. Finally, we design new parallel work-efficient algorithms for approximating the $k$-clique densest subgraph. Our first algorithm gives a $1/k$-approximation and is based on iteratively peeling vertices with the lowest clique counts; our algorithm is work-efficient, but we prove that this process is P-complete and hence does not have polylogarithmic span. Our second algorithm gives a $1/(k(1+\epsilon))$-approximation, is work-efficient, and has polylogarithmic span.

In addition, we implement these algorithms and propose optimizations. On a 60-core machine, we achieve 13.23-38.99x and 1.19-13.76x self-relative parallel speedup for $k$-clique counting and $k$-clique densest subgraph, respectively. Compared to the state-of-the-art parallel $k$-clique counting algorithms, we achieve a 1.31-9.88x speedup, and compared to existing implementations of $k$-clique densest subgraph, we achieve a 1.01-11.83x speedup. We are able to compute the $4$-clique counts on the largest publicly-available graph with over two hundred billion edges.
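To make the idea of counting $k$-cliques over a low out-degree orientation concrete, the following is a minimal sequential Python sketch. It is not the parallel algorithm of the paper: it assumes a smallest-degree-first (degeneracy) ordering as one standard way to obtain a low out-degree orientation, and the function names `build_adjacency`, `degeneracy_orientation`, and `count_k_cliques` are hypothetical names chosen for illustration. Vertex labels are assumed to be mutually comparable (for example, integers).

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set(neighbors)} from edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degeneracy_orientation(adj):
    """Direct every edge from the endpoint peeled earlier to the one peeled later.

    Vertices are peeled in smallest-remaining-degree-first order (a simple
    quadratic-time loop, chosen for clarity rather than speed).
    """
    degree = {v: len(ns) for v, ns in adj.items()}
    remaining = set(adj)
    rank = {}
    while remaining:
        v = min(remaining, key=lambda u: (degree[u], u))
        rank[v] = len(rank)
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return {v: {u for u in adj[v] if rank[u] > rank[v]} for v in adj}


def count_k_cliques(adj, k):
    """Count k-cliques by recursively intersecting directed out-neighborhoods."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if k == 1:
        return len(adj)
    out = degeneracy_orientation(adj)

    def extend(candidates, levels_left):
        # `candidates` holds vertices that are out-neighbors of every vertex
        # chosen so far, so each one extends the current partial clique.
        if levels_left == 1:
            return len(candidates)
        return sum(extend(candidates & out[v], levels_left - 1) for v in candidates)

    return sum(extend(out[v], k - 1) for v in adj)


if __name__ == "__main__":
    k5 = build_adjacency(combinations(range(5), 2))  # complete graph on 5 vertices
    print(count_k_cliques(k5, 3))  # prints 10
```

Because every edge points from lower to higher rank, each clique is discovered exactly once, starting from its lowest-ranked vertex. In this sketch the outermost sum over vertices is the natural place where a parallel implementation would distribute work, though the scheduling and the orientation algorithm described in the abstract are not reproduced here.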

