Join Processing for Graph Patterns: An Old Dog with New Tricks

Proceedings of the GRADES'15. Pub Date: 2015-03-13. DOI: 10.1145/2764947.2764948

D. Nguyen, M. Aref, Martin Bravenboer, G. Kollias, H. Ngo, C. Ré, A. Rudra


Citations: 57

Abstract

Join optimization has been dominated by Selinger-style, pairwise optimizers for decades. However, Selinger-style algorithms are asymptotically suboptimal for applications in graph analytics. This suboptimality is one of the reasons that many have advocated supplementing relational engines with specialized graph processing engines. Recently, new join algorithms have been discovered that achieve optimal worst-case run times for any join, or even so-called beyond-worst-case (or instance-optimal) run-time guarantees for specialized classes of joins. These new algorithms match or improve on those used in specialized graph-processing systems. This paper asks: can these new join algorithms allow relational engines to close the performance gap with graph engines? We examine this question for graph-pattern queries or join queries. We find that classical relational databases like Postgres and MonetDB, or newer graph databases/stores like Virtuoso and Neo4j, may be orders of magnitude slower than a fully featured RDBMS, LogicBlox, that uses these new ideas. Our results demonstrate that an RDBMS with such new algorithms can perform as well as specialized engines like GraphLab, while retaining a high-level interface. We hope our work adds to the ongoing debate on the role of graph accelerators, new graph systems, and relational systems in modern workloads.
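To make the contrast in the abstract concrete, the following is a minimal sketch for the triangle pattern Q(a, b, c) = E(a, b), E(b, c), E(a, c). It is not the paper's or LogicBlox's implementation; the function names and the hub-shaped test graph are assumptions chosen for illustration. The first function follows a pairwise plan that materializes every two-edge path before filtering, and the second binds one variable at a time and intersects adjacency sets, which is the general idea behind worst-case optimal joins.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Map each source vertex to the set of its out-neighbours."""
    out = defaultdict(set)
    for a, b in set(edges):
        out[a].add(b)
    return out


def triangles_pairwise(edges):
    """Pairwise plan: join E(a,b) with E(b,c), then filter by E(a,c).

    Returns the triangles and the size of the intermediate path list.
    """
    edge_set = set(edges)
    out = build_adjacency(edges)
    # Intermediate result: every path a -> b -> c.
    # This can be far larger than the final output.
    paths = [(a, b, c) for a, b in edge_set for c in out.get(b, ())]
    triangles = {(a, b, c) for a, b, c in paths if (a, c) in edge_set}
    return triangles, len(paths)


def triangles_attribute_at_a_time(edges):
    """Bind a, then b, then c; c is drawn from a set intersection."""
    out = build_adjacency(edges)
    triangles = set()
    for a in list(out):
        for b in out[a]:
            # c must satisfy both E(b, c) and E(a, c), so intersect the
            # two adjacency sets instead of enumerating all paths.
            for c in out.get(b, set()) & out[a]:
                triangles.add((a, b, c))
    return triangles


if __name__ == "__main__":
    # Hypothetical hub graph: n vertices point at vertex 0, and vertex 0
    # points back at all of them. It contains no triangles.
    n = 100
    hub = [(i, 0) for i in range(1, n + 1)] + [(0, j) for j in range(1, n + 1)]
    found, intermediate = triangles_pairwise(hub)
    print(len(found), intermediate)
    print(len(triangles_attribute_at_a_time(hub)))
```

On the hub graph the pairwise plan builds roughly n² intermediate paths only to discard all of them, while the attribute-at-a-time version does work bounded by the smaller adjacency set at each step. This is a toy illustration of the asymptotic gap the abstract refers to, not a benchmark.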

