A distributed graph approach for pre-processing linked RDF data using supercomputers

Proceedings of the International Workshop on Semantic Big Data. Pub Date: 2017-05-19. DOI: 10.1145/3066911.3066913

M. Lewis, G. Thiruvathukal, V. Vishwanath, M. Papka, Andrew E. Johnson

Citations: 1

Abstract

Efficient graph-based RDF queries are becoming more pertinent as interest grows in data analytics and its intersection with large, unstructured but connected data. Many commercial systems have adopted distributed RDF graph systems to handle increasing dataset sizes and complex queries. This paper introduces a distributed graph approach to pre-processing linked data. Instead of traversing the in-memory graph, our system indexes pre-processed join elements that are organized in a graph structure. We analyze the DBpedia dataset (derived from the Wikipedia corpus) and compare our access method to a graph traversal access approach that we also devise. Our experiments show that the distributed, pre-processed graph approach to accessing linked data is faster than the traversal approach over a specific range of linked queries.
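
The contrast the abstract draws between traversing a graph at query time and looking up pre-processed join elements can be illustrated with a minimal single-machine sketch. Everything in it is an assumption made for illustration: the toy triples, the predicate names, and the helper functions `build_adjacency`, `traverse`, and `build_join_index` are hypothetical and are not taken from the paper, which targets a distributed setting on supercomputers and does not describe its data structures in this abstract.

```python
from collections import defaultdict

# Toy (subject, predicate, object) triples, invented for illustration.
# They are not taken from DBpedia.
TRIPLES = [
    ("Alice", "bornIn", "Chicago"),
    ("Bob", "bornIn", "Chicago"),
    ("Carol", "bornIn", "Boston"),
    ("Chicago", "locatedIn", "Illinois"),
    ("Boston", "locatedIn", "Massachusetts"),
]


def build_adjacency(triples):
    """Adjacency list keyed by (subject, predicate), used by the traversal baseline."""
    adj = defaultdict(list)
    for s, p, o in triples:
        adj[(s, p)].append(o)
    return adj


def traverse(adj, start, predicates):
    """Answer a linked (path) query by following one predicate per hop at query time."""
    frontier = {start}
    for p in predicates:
        frontier = {o for node in frontier for o in adj.get((node, p), [])}
    return frontier


def build_join_index(triples, p1, p2):
    """Pre-compute the join of p1 and p2 where the object of p1 is the subject of p2."""
    by_subject = defaultdict(list)
    for s, p, o in triples:
        if p == p2:
            by_subject[s].append(o)
    index = defaultdict(set)
    for s, p, o in triples:
        if p == p1:
            index[s].update(by_subject.get(o, []))
    return index


adj = build_adjacency(TRIPLES)
join_index = build_join_index(TRIPLES, "bornIn", "locatedIn")

# Same two-hop query answered both ways.
via_traversal = traverse(adj, "Alice", ["bornIn", "locatedIn"])
via_index = join_index["Alice"]
print(via_traversal, via_index)
```

In this sketch the pre-processed index answers the two-hop query with a single lookup, at the cost of building and storing the join ahead of time, while the traversal does one expansion per hop. Whether that trade-off pays off depends on the query mix, which is consistent with the abstract's claim being limited to a specific range of linked queries.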
