Parallel Five-cycle Counting Algorithms

JCR quartile: Q2 (Mathematics)

Journal of Experimental Algorithmics. Pub Date: 2022-08-16. DOI: 10.1145/3556541

Jessica Shi, Louisa Ruixue Huang, Julian Shun

Citations: 0

Abstract

Counting the frequency of subgraphs in large networks is a classic research question that reveals the underlying substructures of these networks for important applications. However, subgraph counting is a challenging problem, even for subgraph sizes as small as five, due to the combinatorial explosion in the number of possible occurrences. This article focuses on the five-cycle, which is an important special case of five-vertex subgraph counting and one of the most difficult to count efficiently. We design two new parallel five-cycle counting algorithms and prove that they are work efficient and achieve polylogarithmic span. Both algorithms are based on computing low out-degree orientations, which enables the efficient computation of directed two-paths and three-paths, and the algorithms differ in the ways in which they use this orientation to eliminate double-counting. Additionally, we present new parallel algorithms for obtaining unbiased estimates of five-cycle counts using graph sparsification. We develop fast multicore implementations of the algorithms and propose a work scheduling optimization to improve their performance. Our experiments on a variety of real-world graphs using a 36-core machine with two-way hyper-threading show that our best exact parallel algorithm achieves 10–46× self-relative speedup, outperforms our serial benchmarks by 10–32×, and outperforms the previous state-of-the-art serial algorithm by up to 818×. Our best approximate algorithm, for a reasonable probability parameter, achieves up to 20× self-relative speedup and is able to approximate five-cycle counts 9–189× faster than our best exact algorithm, with between 0.52% and 11.77% error.
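The sparsification idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the simplest scheme, in which each edge is kept independently with probability p: a five-cycle survives only if all five of its edges are kept, which happens with probability p^5, so dividing the count on the sparsified graph by p^5 gives an unbiased estimate. The function names are hypothetical, and the exact counter is a serial brute-force enumeration for small inputs, not the orientation-based parallel algorithms the article describes.

```python
import random
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def count_five_cycles(edges):
    """Exact five-cycle count by brute force (illustration only).

    Each cycle is anchored at its smallest vertex s and found as a path
    s-a-b-c-d that closes back to s. Every cycle is found once per
    direction of traversal, so the total is halved at the end.
    """
    adj = build_adjacency(edges)
    total = 0
    for s in adj:
        for a in adj[s]:
            if a <= s:
                continue
            for b in adj[a]:
                if b <= s or b == a:
                    continue
                for c in adj[b]:
                    if c <= s or c == a:
                        continue
                    for d in adj[c]:
                        if d <= s or d == a or d == b:
                            continue
                        if s in adj[d]:
                            total += 1
    return total // 2


def estimate_five_cycles(edges, p, seed=None):
    """Estimate the five-cycle count from an edge-sampled subgraph.

    Assumes `edges` lists each undirected edge once and 0 < p <= 1.
    """
    rng = random.Random(seed)
    kept = [e for e in edges if rng.random() < p]
    return count_five_cycles(kept) / p ** 5


# The complete graph on five vertices contains 12 five-cycles.
k5 = list(combinations(range(5), 2))
print(count_five_cycles(k5))
print(estimate_five_cycles(k5, 0.8, seed=0))
```

A single sampled estimate can be far from the true count, especially on small graphs or for small p; unbiasedness only says that the average over many independent samples converges to the exact value.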

Source journal

Journal of Experimental Algorithmics (Mathematics: Theoretical Computer Science)

CiteScore

3.10

Self-citation rate

0.00%

Articles published

Journal description: The ACM JEA is a high-quality, refereed, archival journal devoted to the study of discrete algorithms and data structures through a combination of experimentation and classical analysis and design techniques. It focuses on the following areas in algorithms and data structures:
■ combinatorial optimization
■ computational biology
■ computational geometry
■ graph manipulation
■ graphics
■ heuristics
■ network design
■ parallel processing
■ routing and scheduling
■ searching and sorting
■ VLSI design