Monitoring Properties of Large, Distributed, Dynamic Graphs

Gal Yehuda, D. Keren, Islam Akaria
2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS), published 2017-05-01
DOI: 10.1109/IPDPS.2017.123 (https://doi.org/10.1109/IPDPS.2017.123)
Citations: 9

Abstract

A common question in many theoretical and applied domains is: given a graph G, does it satisfy some given property? For example, is G connected? Is its diameter smaller than a given threshold? Is its average degree larger than a given threshold? Traditionally, algorithms that answer such questions quickly were developed for static, centralized graphs (i.e., G is stored on a central server, and the list of its vertices and edges is static and quickly accessible). Later, driven by practical considerations, much attention was given to online algorithms for dynamic graphs (where vertices and edges can be added and deleted); the research focus was on quickly deciding whether the updated graph still satisfies the given property. Today, a more difficult version of this problem, referred to as the distributed monitoring problem, is becoming increasingly important: large graphs are not only dynamic but also distributed; that is, G is partitioned among a few servers, none of which "sees" G in its entirety. The question is how to define local conditions such that, as long as they hold on the local graphs, the desired property is guaranteed to hold for the global G. Such local conditions are crucial for avoiding a huge communication overhead. While local conditions for linear properties (e.g., average degree) are relatively easy to define, they are considerably more difficult to derive for non-linear functions over graphs. We propose a solution and a general definition of solution optimality, and demonstrate how to apply it to two important graph properties: the spectral gap and the number of triangles. We also define an absolute lower bound on the communication overhead for distributed monitoring and compare our algorithm to it, with excellent results. Finally, performance improves as the graph becomes larger and denser, that is, when distributing it is more important.
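To make the notion of a local condition concrete, the following is a minimal sketch for the linear case the abstract calls relatively easy: keeping the average degree above a threshold. It is not the paper's algorithm. The `Server` class, the even split of slack, and the assumptions that the vertex count is fixed and each edge is stored on exactly one server are all illustrative choices.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """One site holding a disjoint share of the edges (hypothetical model)."""
    edge_count: int
    lower_bound: float = 0.0  # local condition: edge_count > lower_bound

    def local_condition_holds(self) -> bool:
        return self.edge_count > self.lower_bound


def assign_local_bounds(servers, num_vertices, threshold):
    """Coordinator sync: split the global slack evenly among the servers.

    Global property: average degree 2*|E|/|V| > threshold,
    which is equivalent to |E| > threshold * |V| / 2.
    The local bounds sum to exactly that required edge count, so if every
    server satisfies edge_count > lower_bound, the global property holds.
    Returns False if the property already fails at sync time.
    """
    required = threshold * num_vertices / 2
    total = sum(s.edge_count for s in servers)
    slack = total - required
    if slack <= 0:
        return False
    for s in servers:
        s.lower_bound = s.edge_count - slack / len(servers)
    return True


def apply_update(server, delta):
    """Apply a local edge insertion (+1) or deletion (-1).

    Returns True if the server may stay silent, False if its local
    condition broke and it must contact the coordinator.
    """
    server.edge_count += delta
    return server.local_condition_holds()


if __name__ == "__main__":
    servers = [Server(40), Server(30), Server(50)]
    assign_local_bounds(servers, num_vertices=20, threshold=10)
    for step in range(1, 8):
        silent = apply_update(servers[0], -1)
        print(f"deletion {step}: stay silent = {silent}")
```

No messages are exchanged while every local condition holds; communication is needed only when some server violates its bound. The trick depends on the monitored quantity being a sum of per-server quantities. The triangle count is not: a triangle whose three edges are stored on different servers appears in no local graph, which is one way to see why non-linear properties need a different construction.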