SimGQ: Simultaneously Evaluating Iterative Graph Queries

2020 IEEE 27th International Conference on High Performance Computing, Data, and Analytics (HiPC) Pub Date : 2020-12-01 DOI:10.1109/HiPC50609.2020.00014

Chengshuo Xu, Abbas Mazloumi, Xiaolin Jiang, Rajiv Gupta

{"title":"SimGQ: Simultaneously Evaluating Iterative Graph Queries","authors":"Chengshuo Xu, Abbas Mazloumi, Xiaolin Jiang, Rajiv Gupta","doi":"10.1109/HiPC50609.2020.00014","DOIUrl":null,"url":null,"abstract":"Graph processing frameworks are typically designed to optimize the evaluation of a single graph query. However, in practice, we often need to respond to multiple graph queries, either from different users or from a single user performing a complex analytics task. Therefore in this paper we develop SimGQ, a system that optimizes simultaneous evaluation of a group of vertex queries that originate at different source vertices (e.g., multiple shortest path queries originating at different source vertices) and delivers substantial speedups over a conventional framework that evaluates and responds to queries one by one. The performance benefits are achieved via batching and sharing. Batching fully utilizes system resources to evaluate a batch of queries and amortizes runtime overheads incurred due to fetching vertices and edge lists, synchronizing threads, and maintaining computation frontiers. Sharing dynamically identifies shared queries that substantially represent subcomputations in the evaluation of different queries in a batch, evaluates the shared queries, and then uses their results to accelerate the evaluation of all queries in the batch. With four input power-law graphs and four graph algorithms SimGQ achieves speedups of up to 45.67 × with batch sizes of up to 512 queries over the baseline implementation that evaluates the queries one by one using the state of the art Ligra system. Moreover, both batching and sharing contribute substantially to the speedups.","PeriodicalId":375004,"journal":{"name":"2020 IEEE 27th International Conference on High Performance Computing, Data, and Analytics (HiPC)","volume":"95 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE 27th International Conference on High Performance Computing, Data, and Analytics (HiPC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HiPC50609.2020.00014","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 5

Abstract

Graph processing frameworks are typically designed to optimize the evaluation of a single graph query. However, in practice, we often need to respond to multiple graph queries, either from different users or from a single user performing a complex analytics task. Therefore in this paper we develop SimGQ, a system that optimizes simultaneous evaluation of a group of vertex queries that originate at different source vertices (e.g., multiple shortest path queries originating at different source vertices) and delivers substantial speedups over a conventional framework that evaluates and responds to queries one by one. The performance benefits are achieved via batching and sharing. Batching fully utilizes system resources to evaluate a batch of queries and amortizes runtime overheads incurred due to fetching vertices and edge lists, synchronizing threads, and maintaining computation frontiers. Sharing dynamically identifies shared queries that substantially represent subcomputations in the evaluation of different queries in a batch, evaluates the shared queries, and then uses their results to accelerate the evaluation of all queries in the batch. With four input power-law graphs and four graph algorithms SimGQ achieves speedups of up to 45.67 × with batch sizes of up to 512 queries over the baseline implementation that evaluates the queries one by one using the state of the art Ligra system. Moreover, both batching and sharing contribute substantially to the speedups.

查看原文本刊更多论文

SimGQ:同时评估迭代图查询

图处理框架通常被设计为优化单个图查询的评估。然而，在实践中，我们经常需要响应来自不同用户或执行复杂分析任务的单个用户的多个图形查询。因此，在本文中，我们开发了SimGQ，这是一个系统，可以优化同时评估来自不同源顶点的一组顶点查询(例如，来自不同源顶点的多个最短路径查询)，并提供了比传统框架更大的速度，传统框架是一个接一个地评估和响应查询。性能优势是通过批处理和共享实现的。批处理充分利用系统资源来评估一批查询，并分摊由于获取顶点和边缘列表、同步线程和维护计算边界而产生的运行时开销。共享动态地识别实质上表示批处理中不同查询的子计算的共享查询，计算共享查询，然后使用它们的结果来加速批处理中所有查询的计算。使用四个输入幂律图和四个图算法，SimGQ在使用最先进的Ligra系统逐个评估查询的基线实现上实现了高达45.67倍的加速，批处理大小高达512个查询。此外，批处理和共享都对加速有很大贡献。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

2020 IEEE 27th International Conference on High Performance Computing, Data, and Analytics (HiPC)

自引率

0.00%

发文量