Resource Efficiency to Partition Big Streamed Graphs

Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing. Pub Date: 2015-06-29. DOI: 10.1109/ISPDC.2015.21

Víctor Medel Gracia, Unai Arronategui Arribalzaga



Abstract

Real-time streaming and processing of big graphs is a relevant and challenging application to run on a Cloud infrastructure. We have analysed the amount of resources needed to partition large streamed graphs with different distributed architectures. We overcome limitations of the state of the art by proposing a decentralised and scalable model that is more efficient in memory usage, network traffic and number of processing machines. The improvement is achieved by summarising the incoming vertices of the graph and accessing local information about the already partitioned graph, whereas classical approaches need all information about the previous vertices. In our system, local information is updated periodically through a feedback scheme. Our experimental results show that current architectures cannot process large-scale streamed graphs due to memory limitations. We have shown that our architecture reduces the number of machines needed by seven because it accesses local memory instead of distributed memory. The total memory size is also reduced. Finally, our model allows the quality of the partition solution to be adjusted to the desired amount of memory and network traffic.
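
To make concrete why classical approaches need all information about the previous vertices, the following is a minimal sketch of a generic greedy streaming partitioner. It is not the model proposed in the paper, whose details the abstract does not give. The function name, the parameters and the scoring rule (neighbour count damped by partition load) are assumptions chosen for illustration.

```python
def greedy_stream_partition(stream, num_parts, capacity):
    """Assign each arriving vertex to a partition with a greedy rule.

    Illustrative only: a generic heuristic, not the paper's method.
    stream: iterable of (vertex, neighbours) pairs in arrival order.
    capacity: soft size limit per partition (exceeded only if all are full).
    """
    assignment = {}            # vertex -> partition; grows with the whole graph
    sizes = [0] * num_parts    # number of vertices placed in each partition

    for vertex, neighbours in stream:
        # Count the already-placed neighbours in each partition.
        counts = [0] * num_parts
        for n in neighbours:
            p = assignment.get(n)
            if p is not None:
                counts[p] += 1

        # Score a partition by its neighbour count, damped by how full it is.
        def score(p):
            return counts[p] * (1 - sizes[p] / capacity)

        # Highest score wins; ties go to the emptiest partition.
        best = max(range(num_parts), key=lambda p: (score(p), -sizes[p]))
        assignment[vertex] = best
        sizes[best] += 1

    return assignment, sizes


if __name__ == "__main__":
    edges = [(1, [2]), (2, [1]), (3, [4]), (4, [3])]
    print(greedy_stream_partition(edges, num_parts=2, capacity=2))
```

The `assignment` map holds one entry per vertex seen so far, so its size grows with the graph. This is the kind of memory pressure the abstract attributes to current architectures.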
