Efficient Multi-depth Querying on Provenance of Relational Queries Using Graph Database
Authors: A. Rani, Navneet Goyal, S. Gadia
Published in: Proceedings of the 9th Annual ACM India Conference, 2016-10-21
DOI: 10.1145/2998476.2998480 (https://doi.org/10.1145/2998476.2998480)
Citations: 6
Abstract
Data provenance is the history associated with a piece of data. It covers the origin, creation, processing, and archiving of that data. In today's Internet era, it has gained significant importance for database analytics. Most provenance models store provenance information in relational databases for further querying and analysis. Although querying provenance in relational databases is very efficient for small data sets, it becomes inefficient as the provenance data grows and the traversal depth of provenance queries increases. This is mainly due to the increasing number of join operations needed to search the entire provenance data. Graph databases provide an alternative to RDBMSs for storing and analyzing provenance data, as they can scale to billions of nodes and at the same time traverse thousands of relationships efficiently. In this paper, we propose efficient multi-depth querying of provenance data using graph databases. The proposed solution allows efficient querying of the provenance of current as well as historical queries. A comparison between relational and graph databases is presented for varying provenance data sizes and traversal depths. Graph databases are found to scale well with increasing depth of provenance queries, whereas in relational databases the querying time increases exponentially.
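To make the contrast between join-based and traversal-based provenance lookups concrete, the following is a minimal Python sketch. It is not the authors' implementation and uses no real database: the edge list, the item names, and both function names are hypothetical and chosen only for illustration. The first function mimics the relational approach, where each extra level of depth costs another pass over the whole edge table (one more self-join). The second mimics a graph traversal that follows only the edges adjacent to nodes it has already reached.

```python
from collections import deque

# Hypothetical provenance edges: (derived_item, source_item).
EDGES = [
    ("report", "summary"),
    ("summary", "orders_clean"),
    ("orders_clean", "orders_raw"),
    ("summary", "customers"),
]


def ancestors_by_joins(edges, start, depth):
    """Relational style: one scan of the full edge table per level of depth."""
    frontier = {start}
    found = set()
    for _ in range(depth):
        # Each iteration scans every edge, much like adding one more self-join.
        frontier = {src for (dst, src) in edges if dst in frontier}
        found |= frontier
    return found


def ancestors_by_traversal(edges, start, depth):
    """Graph style: breadth-first walk over adjacency lists, limited by depth."""
    adjacency = {}
    for dst, src in edges:
        adjacency.setdefault(dst, []).append(src)

    found = set()
    queue = deque([(start, 0)])
    while queue:
        node, level = queue.popleft()
        if level == depth:
            continue  # do not expand beyond the requested depth
        for src in adjacency.get(node, []):
            if src not in found:
                found.add(src)
                queue.append((src, level + 1))
    return found
```

Both functions return the same set of sources for a given depth. The difference is in the work done: the join-style version touches every edge at every level, while the traversal touches only edges leaving nodes it has visited. This is one plausible reading of why deeper provenance queries favor graph storage, under the assumptions of this toy model.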