Efficient Multi-depth Querying on Provenance of Relational Queries Using Graph Database

Proceedings of the 9th Annual ACM India Conference. Pub Date: 2016-10-21. DOI: 10.1145/2998476.2998480

A. Rani, Navneet Goyal, S. Gadia


Citations: 6

Abstract

Data provenance is the history associated with a piece of data. It covers the origin, creation, processing, and archiving of that data. In today's Internet era, it has gained significant importance for database analytics. Most provenance models store provenance information in relational databases for further querying and analysis. Although querying provenance in relational databases is very efficient for small data sets, it becomes inefficient as the provenance data grows and the traversal depth of provenance queries increases. This is mainly due to the growing number of join operations needed to search the entire provenance data. Graph databases provide an alternative to RDBMSs for storing and analyzing provenance data, as they can scale to billions of nodes while traversing thousands of relationships efficiently. In this paper, we propose efficient multi-depth querying of provenance data using graph databases. The proposed solution allows efficient querying of the provenance of current as well as historical queries. A comparison between relational and graph databases is presented for varying provenance data sizes and traversal depths. Graph databases are found to scale well with increasing depth of provenance queries, whereas in relational databases the querying time increases exponentially.
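
To make the join argument in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical edge table `prov(derived, source)` and made-up item names, uses SQLite as a stand-in relational store, and uses a plain in-memory adjacency map as a stand-in for a graph database. It shows that a relational query for ancestors at depth d needs d - 1 self-joins, while the graph-style version follows one hop per level.

```python
import sqlite3

# Hypothetical provenance edges (derived_item, source_item); illustrative only.
EDGES = [
    ("report", "summary"),
    ("summary", "orders_clean"),
    ("summary", "customers"),
    ("orders_clean", "orders_raw"),
]


def build():
    """Load the same edges into a relational table and an adjacency map."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE prov (derived TEXT, source TEXT)")
    conn.executemany("INSERT INTO prov VALUES (?, ?)", EDGES)
    adj = {}
    for derived, source in EDGES:
        adj.setdefault(derived, set()).add(source)
    return conn, adj


def ancestors_sql(conn, item, depth):
    """Items exactly `depth` hops upstream (depth >= 1), via depth - 1 self-joins."""
    joins = "".join(
        f" JOIN prov e{i} ON e{i}.derived = e{i - 1}.source"
        for i in range(1, depth)
    )
    sql = (
        f"SELECT DISTINCT e{depth - 1}.source "
        f"FROM prov e0{joins} WHERE e0.derived = ?"
    )
    return {row[0] for row in conn.execute(sql, (item,))}


def ancestors_graph(adj, item, depth):
    """Same result by following adjacency lists one hop at a time."""
    frontier = {item}
    for _ in range(depth):
        frontier = {src for node in frontier for src in adj.get(node, ())}
    return frontier


if __name__ == "__main__":
    conn, adj = build()
    for d in (1, 2, 3):
        print(d, sorted(ancestors_sql(conn, "report", d)),
              sorted(ancestors_graph(adj, "report", d)))
```

Both functions return the same sets on this toy data; the difference is in how the work grows. The SQL text gains one join per extra level, whereas the traversal only touches the neighbors of the current frontier. This sketch does not reproduce the measurements reported in the paper.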
