Proceedings of the International Workshop on Semantic Big Data: Latest Papers

SPARTI

Proceedings of the International Workshop on Semantic Big Data Pub Date : 2018-06-10 DOI: 10.1145/3208352.3208356

Amgad Madkour, Walid G. Aref, Ahmed M. Aly

Abstract: Semantic data is an integral component for search engines that provide answers beyond simple keyword-based matches. Resource Description Framework (RDF) provides a standardized and flexible graph model for representing semantic data. The astronomical growth of RDF data raises the need for scalable RDF management strategies. Although cloud-based systems provide a rich platform for managing large-scale RDF data, the shared storage provided by these systems introduces several performance challenges, e.g., disk I/O and network shuffling overhead. This paper investigates SPARTI, a scalable RDF data management system. In SPARTI, the partitioning of the data is based on the join patterns found in the query workload. Initially, SPARTI vertically partitions the RDF data, and then incrementally updates the partitioning according to the workload, which improves the query performance of frequent join patterns. SPARTI utilizes a partitioning schema, termed SemVP, that enables the system to read a reduced set of rows instead of entire partitions. SPARTI proposes a budgeting mechanism with a cost model to determine the worthiness of partitioning. Using real and synthetic datasets, SPARTI is compared against a Spark-based state-of-the-art system and is shown to execute queries around half the time over all query shapes while maintaining around an order of magnitude enhancement in storage requirements.

Citations: 1

Extending Apache Spark with a Mediation Layer

Proceedings of the International Workshop on Semantic Big Data Pub Date : 2018-06-10 DOI: 10.1145/3208352.3208354

Dimitris Stripelis, Chrysovalantis Anastasiou, J. Ambite

Citations: 3

Timestamp-based Integrity Proofs for Linked Data

Proceedings of the International Workshop on Semantic Big Data Pub Date : 2018-06-10 DOI: 10.1145/3208352.3208353

Andrew Sutton, Reza Samavi

Citations: 2

Stream WatDiv: A Streaming RDF Benchmark

Proceedings of the International Workshop on Semantic Big Data Pub Date : 2018-06-10 DOI: 10.1145/3208352.3208355

Libo Gao, Lukasz Golab, M. Tamer Özsu, Günes Aluç

Citations: 10

TrueWeb

Proceedings of the International Workshop on Semantic Big Data Pub Date : 2018-06-10 DOI: 10.1145/3208352.3208357

Amgad Madkour, Walid G. Aref, Sunil Prabhakar, Mohamed S. Ali, Siarhei Bykau

Citations: 2

Using semantic web technologies to power LungMAP, a molecular data repository

Proceedings of the International Workshop on Semantic Big Data Pub Date : 2017-05-19 DOI: 10.1145/3066911.3066916

Michelle C. Krzyzanowski, Josh Levy, G. Page, N. Gaddis, R. Clark

Citations: 2

Extracting linked data from statistic spreadsheets

Proceedings of the International Workshop on Semantic Big Data Pub Date : 2017-05-19 DOI: 10.1145/3066911.3066914

Tien-Duc Cao, I. Manolescu, Xavier Tannier

Citations: 17

On data placement strategies in distributed RDF stores

Proceedings of the International Workshop on Semantic Big Data Pub Date : 2017-05-19 DOI: 10.1145/3066911.3066915

Daniel Janke, Steffen Staab, Matthias Thimm

Citations: 7

Evolution of anatomical concept usage over time: mining 200 years of biodiversity literature

Proceedings of the International Workshop on Semantic Big Data Pub Date : 2017-05-19 DOI: 10.1145/3066911.3066919

Prashanti Manda, T. Vision

Citations: 0

A distributed graph approach for pre-processing linked RDF data using supercomputers

Proceedings of the International Workshop on Semantic Big Data Pub Date : 2017-05-19 DOI: 10.1145/3066911.3066913

M. Lewis, G. Thiruvathukal, V. Vishwanath, M. Papka, Andrew E. Johnson

Citations: 1