The Power of Pivoting for Exact Clique Counting

Proceedings of the 13th International Conference on Web Search and Data Mining. Pub Date: 2020-01-19. DOI: 10.1145/3336191.3371839

Shweta Jain, C. Seshadhri


Citations: 26

Abstract

Clique counting is a fundamental task in network analysis, and even the simplest setting of 3-cliques (triangles) has been the center of much recent research. Counting k-cliques for larger k is algorithmically challenging because the search space of large cliques blows up exponentially. Yet a number of recent applications (especially in community detection and clustering) use larger clique counts. Moreover, one often wants local counts: the number of k-cliques per vertex or per edge. Our main contribution is Pivoter, an algorithm that exactly counts the number of k-cliques for all values of k. It is surprisingly effective in practice and obtains clique counts for graphs that were beyond the reach of previous work. For example, Pivoter gets all clique counts in a social network with 100M edges within two hours on a commodity machine, whereas previous parallel algorithms do not terminate in days. Pivoter can also feasibly compute local per-vertex and per-edge k-clique counts (for all k) for many public data sets with tens of millions of edges. To the best of our knowledge, this is the first algorithm that achieves such results. The main insight is the construction of a Succinct Clique Tree (SCT) that stores a compressed, unique representation of all cliques in an input graph. It is built using pivoting, a classic technique due to Bron and Kerbosch for reducing the recursion tree of backtracking algorithms for maximal cliques. Remarkably, the SCT can be built without actually enumerating all cliques, and it provides a succinct data structure from which exact clique statistics (k-clique counts, local counts) can be read off efficiently.
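To make the pivoting idea concrete, the following is a minimal Python sketch of a pivot-based recursion that counts cliques of every size without listing them. It is an illustration written for this summary, not the authors' implementation: the function name `count_cliques`, the adjacency-dict input format, and the choice to start the recursion on the whole vertex set are all assumptions, and it omits the engineering a large graph would require. Each recursive call tracks how many vertices are "held" (must be in the clique) and how many are "pivots" (optional); at a leaf with h held and p pivot vertices, every subset of the pivots yields a clique, so the count for size h + i grows by C(p, i).

```python
from math import comb


def count_cliques(adj):
    """Count cliques of every size in an undirected graph.

    adj: dict mapping each vertex to the set of its neighbours
         (hypothetical input format; no self-loops assumed).
    Returns a list `counts` where counts[k] is the number of k-cliques
    (counts[0] == 1 accounts for the empty clique).
    """
    counts = [0] * (len(adj) + 1)

    def recurse(cands, held, pivots):
        # Leaf: each subset of the pivot vertices, together with the
        # held vertices, is a distinct clique.
        if not cands:
            for i in range(pivots + 1):
                counts[held + i] += comb(pivots, i)
            return

        # Pick the candidate with the most neighbours among the candidates.
        pivot = max(cands, key=lambda u: len(adj[u] & cands))

        # Branch 1: cliques that avoid every non-neighbour of the pivot.
        # The pivot itself is optional, so it is recorded as a pivot vertex.
        recurse(adj[pivot] & cands, held, pivots + 1)

        # Branch 2: cliques whose first non-neighbour of the pivot is v.
        # v is mandatory (held); earlier non-neighbours are excluded.
        remaining = set(cands)
        for v in cands - adj[pivot] - {pivot}:
            remaining.discard(v)
            recurse(adj[v] & remaining, held + 1, pivots)

    recurse(set(adj), 0, 0)
    return counts


# Example: a triangle 0-1-2 with an extra edge 2-3.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(count_cliques(graph))  # index k holds the number of k-cliques
```

In this sketch the recursion tree plays the role that the abstract assigns to the SCT: each root-to-leaf path compactly represents a family of cliques, which is why the counts can be accumulated without enumerating the cliques one by one.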

