Adaptive Partitioning for Large-Scale Dynamic Graphs

2014 IEEE 34th International Conference on Distributed Computing Systems. Pub Date: 2013-09-04. DOI: 10.1145/2523616.2525943

Luis M. Vaquero, F. Cuadrado, Dionysios Logothetis, Claudio Martella


Citations: 65

Abstract

In recent years, large-scale graph processing has gained increasing attention, with most recent systems placing particular emphasis on latency. One way to improve runtime performance in a distributed graph processing system is to reduce network communication. The most notable way to achieve this is to partition the graph so as to minimize the number of edges connecting vertices assigned to different machines, while keeping the load balanced. However, real-world graphs are highly dynamic, with vertices and edges constantly being added and removed. The partitioning must be carefully updated to reflect these changes; otherwise a large number of cut edges accumulates and computation performance gradually degrades. In this paper we show that performance degradation in dynamic graph processing systems can be avoided by continuously adapting the graph partitions as the graph changes. We present a novel, highly scalable adaptive partitioning strategy, and show a number of refinements that make it work under the constraints of a large-scale distributed system. The partitioning strategy is based on iterative vertex migrations that rely only on local information. We have implemented the technique in a graph processing system, and we show through three real-world scenarios how adapting the graph partitioning reduces execution time by over 50% compared to the commonly used hash partitioning.
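To make the idea of iterative, locally informed vertex migration concrete, here is a minimal single-process sketch in Python. It is not the authors' algorithm or system: the function names (`hash_partition`, `cut_edges`, `migrate_once`, `adapt`), the greedy "move to the partition holding most of my neighbours" rule, and the fixed per-partition `capacity` used as a stand-in for load balancing are all assumptions chosen for illustration. Each vertex looks only at the partitions of its own neighbours, which is one plausible reading of "relying only on local information".

```python
from collections import Counter


def hash_partition(vertices, k):
    """Baseline (assumed form): place integer vertex v in partition v mod k."""
    return {v: v % k for v in vertices}


def cut_edges(edges, assignment):
    """Count edges whose endpoints sit in different partitions."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])


def migrate_once(adjacency, assignment, k, capacity):
    """One sweep of greedy migrations; returns how many vertices moved.

    A vertex moves to the partition that holds strictly more of its
    neighbours than its current one, provided the target partition is
    below the (hypothetical) capacity limit.
    """
    sizes = Counter(assignment.values())
    moves = 0
    for v in sorted(adjacency):
        counts = Counter(assignment[n] for n in adjacency[v])
        current = assignment[v]
        # Ties are broken in favour of staying put, then by lowest partition id.
        best = max(range(k), key=lambda p: (counts[p], p == current, -p))
        if best != current and sizes[best] < capacity:
            sizes[current] -= 1
            sizes[best] += 1
            assignment[v] = best
            moves += 1
    return moves


def adapt(edges, assignment, k, capacity, max_sweeps=20):
    """Repeat sweeps until no vertex wants to move (or max_sweeps is hit)."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    for _ in range(max_sweeps):
        if migrate_once(adjacency, assignment, k, capacity) == 0:
            break
    return assignment


if __name__ == "__main__":
    # Two triangles joined by a single edge (2, 3).
    edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
    assignment = hash_partition(range(6), k=2)
    print("cut edges with hash partitioning:", cut_edges(edges, assignment))
    adapt(edges, assignment, k=2, capacity=4)
    print("cut edges after migration:", cut_edges(edges, assignment))
```

Because vertices are processed one at a time and a move happens only when it strictly increases the number of co-located neighbours, every move lowers the cut, so the loop terminates. A distributed system in which many vertices decide simultaneously would need extra care that this sketch does not model. When the graph changes, calling `adapt` again on the updated edge list continues from the current assignment rather than repartitioning from scratch.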

