Recursive Query Evaluation in a Column DBMS to Analyze Large Graphs

C. Ordonez, Achyuth Gurram, N. Rai
International Workshop on Data Warehousing and OLAP, published 2014-11-07
DOI: 10.1145/2666158.2666177 (https://doi.org/10.1145/2666158.2666177)
Citations: 4

Abstract

Graphs pose a major challenge in big data analytics, for which there are many systems and prototypes, most of them not based on relational database management systems (DBMSs). Graph problems require substantially different algorithms from other analytical techniques (e.g., cubes, statistical models, machine learning), and they are especially important in the analysis of social networks and the Internet. Recursive queries are a fundamental mechanism for analyzing graphs in a DBMS, but they can be slow on large graphs. Column DBMSs are a novel, faster kind of database system, but their storage and retrieval mechanisms differ significantly from those of traditional row DBMSs. We therefore study the pros and cons of optimizing recursive queries on a column DBMS. Specifically, we study two interrelated graph problems, transitive closure and adjacency matrix multiplication, together with the optimization of their queries, which combine recursive joins and recursive aggregations. An experimental evaluation with large graphs compares query optimization in a column DBMS and a row DBMS. We analyze performance tradeoffs with graphs of significantly different size, shape, and connectivity. Our benchmark results show that column DBMSs analyze graphs much faster than row DBMSs, especially as graphs get larger and denser.
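
To make the two graph problems concrete, here is a minimal sketch of how transitive closure and one step of adjacency matrix multiplication can be phrased as SQL queries. It is an illustration only, not the authors' queries or optimizations: the edge table E(i, j, v), the function names, and the toy graph are assumptions, and SQLite (a row store, driven from Python's standard sqlite3 module) stands in for the DBMSs the paper benchmarks.

```python
import sqlite3


def build_graph(edges):
    """Load directed edges into a hypothetical table E(i, j, v) with v = 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE E (i INTEGER, j INTEGER, v INTEGER)")
    conn.executemany("INSERT INTO E VALUES (?, ?, 1)", edges)
    return conn


def transitive_closure(conn):
    """Recursive join: all (i, j) pairs such that j is reachable from i."""
    sql = """
    WITH RECURSIVE R(i, j) AS (
        SELECT i, j FROM E                          -- base step: direct edges
        UNION                                       -- UNION removes duplicates, so cycles terminate
        SELECT R.i, E.j FROM R JOIN E ON R.j = E.i  -- recursive step: extend paths by one edge
    )
    SELECT i, j FROM R ORDER BY i, j
    """
    return conn.execute(sql).fetchall()


def adjacency_squared(conn):
    """Join plus aggregation: E x E, i.e. the number of length-2 paths per vertex pair."""
    sql = """
    SELECT E1.i, E2.j, SUM(E1.v * E2.v) AS v
    FROM E AS E1 JOIN E AS E2 ON E1.j = E2.i
    GROUP BY E1.i, E2.j
    ORDER BY E1.i, E2.j
    """
    return conn.execute(sql).fetchall()


# Toy graph (assumed for illustration): 1->2, 2->3, 3->4, 1->3
conn = build_graph([(1, 2), (2, 3), (3, 4), (1, 3)])
closure = transitive_closure(conn)
paths2 = adjacency_squared(conn)
print(closure)
print(paths2)
```

The closure query repeats a self-join until no new pairs appear, while the matrix product pairs a join with a GROUP BY aggregation; these are the join and aggregation patterns the abstract refers to when it mentions recursive joins and recursive aggregations.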