Monitoring Properties of Large, Distributed, Dynamic Graphs

2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS). Pub Date: 2017-05-01. DOI: 10.1109/IPDPS.2017.123

Gal Yehuda, D. Keren, Islam Akaria


Citations: 9

Abstract

The following is a very common question in numerous theoretical and application-related domains: given a graph G, does it satisfy some given property? For example, is G connected? Is its diameter smaller than a given threshold? Is its average degree larger than a certain threshold? Traditionally, algorithms to quickly answer such questions were developed for static and centralized graphs (i.e. G is stored in a central server and the list of its vertices and edges is static and quickly accessible). Later, as dictated by practical considerations, a great deal of attention was given to on-line algorithms for dynamic graphs (where vertices and edges can be added and deleted); the focus of research was to quickly decide whether the new graph still satisfies the given property. Today, a more difficult version of this problem, referred to as the distributed monitoring problem, is becoming increasingly important: large graphs are not only dynamic, but also distributed, that is, G is partitioned between a few servers, none of which "sees" G in its entirety. The question is how to define local conditions, such that as long as they hold on the local graphs, it is guaranteed that the desired property holds for the global G. Such local conditions are crucial for avoiding a huge communication overhead. While defining local conditions for linear properties (e.g. average degree) is relatively easy, they are considerably more difficult to derive for non-linear functions over graphs. We propose a solution and a general definition of solution optimality, and demonstrate how to apply it to two important graph properties – the spectral gap and the number of triangles. We also define an absolute lower bound on the communication overhead for distributed monitoring, and compare our algorithm to it, with excellent results. Last but not least, performance improves as the graph becomes larger and denser – that is, when distributing it is more important.
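To make the contrast between linear and non-linear properties concrete, here is a minimal sketch. It is not the algorithm proposed in the paper. It assumes a fixed, globally known vertex count and that each edge is held by exactly one server; the function names, the example graph, and the even split of the threshold are all hypothetical. Under those assumptions the average degree 2|E|/|V| is a sum of per-server contributions, so if every server's contribution exceeds its own share of the threshold, the global average degree exceeds the global threshold and no communication is needed. The triangle count has no such decomposition, because a triangle whose edges are spread over several servers is invisible to each of them.

```python
from itertools import combinations


def average_degree(num_vertices, edges):
    """Average degree of an undirected simple graph: 2|E| / |V|."""
    return 2 * len(edges) / num_vertices


def local_condition_holds(num_vertices, local_edges, local_threshold):
    """One server's check: its contribution 2|E_i| / |V| exceeds its share t_i."""
    return 2 * len(local_edges) / num_vertices > local_threshold


def count_triangles(edges):
    """Brute-force triangle count; only suitable for tiny illustrative graphs."""
    edge_set = {frozenset(e) for e in edges}
    vertices = sorted({v for e in edge_set for v in e})
    return sum(
        1
        for a, b, c in combinations(vertices, 3)
        if {frozenset((a, b)), frozenset((b, c)), frozenset((a, c))} <= edge_set
    )


# Hypothetical example: 4 vertices, 5 edges partitioned between 2 servers.
NUM_VERTICES = 4
SERVER_EDGES = [
    [(0, 1), (1, 2)],          # edges held by server 0
    [(0, 2), (2, 3), (0, 3)],  # edges held by server 1
]
GLOBAL_THRESHOLD = 1.5
LOCAL_THRESHOLDS = [0.75, 0.75]  # assumed split; any shares summing to 1.5 work

all_edges = [e for part in SERVER_EDGES for e in part]

# Linear property: local conditions that all hold imply the global property.
all_local_ok = all(
    local_condition_holds(NUM_VERTICES, part, t)
    for part, t in zip(SERVER_EDGES, LOCAL_THRESHOLDS)
)
global_ok = average_degree(NUM_VERTICES, all_edges) > GLOBAL_THRESHOLD

# Non-linear property: local triangle counts do not add up to the global count,
# because triangle (0, 1, 2) has edges on both servers.
local_triangle_sum = sum(count_triangles(part) for part in SERVER_EDGES)
global_triangles = count_triangles(all_edges)

if __name__ == "__main__":
    print("all local conditions hold:", all_local_ok)
    print("global average degree above threshold:", global_ok)
    print("sum of local triangle counts:", local_triangle_sum)
    print("global triangle count:", global_triangles)
```

The implication only runs one way: a server violating its local condition does not mean the global property fails, only that some communication is needed to re-check or rebalance the thresholds.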
