More Complete Resultset Retrieval from Large Heterogeneous RDF Sources

André Valdestilhas, Tommaso Soru, Muhammad Saleem
Proceedings of the 10th International Conference on Knowledge Capture. Published 2019-09-23. DOI: 10.1145/3360901.3364436 (https://doi.org/10.1145/3360901.3364436)
Citations: 3

Abstract

Over the last few years, the Web of Data has grown significantly. Various interfaces such as LOD Stats, LOD Laundromat, and SPARQL endpoints provide access to hundreds of thousands of RDF datasets, representing billions of facts. These datasets are available in different formats, such as raw data dumps and HDT files, or are directly accessible via SPARQL endpoints. Querying such a large amount of distributed data is particularly challenging, and many of these datasets cannot be queried directly using the SPARQL query language. To tackle these problems, we present WimuQ, an integrated query engine that executes SPARQL queries and retrieves results from a large number of heterogeneous RDF data sources. Presently, WimuQ is able to execute both federated and non-federated SPARQL queries over a total of 668,166 datasets from LOD Stats and LOD Laundromat as well as 559 active SPARQL endpoints. These data sources represent a total of 221.7 billion triples from more than 5 terabytes of information, drawn from datasets retrieved using the service "Where is My URI" (WIMU). Our evaluation on state-of-the-art real-data benchmarks shows that WimuQ retrieves more complete results for the benchmark queries.
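To make the distinction between federated and non-federated queries concrete, the sketch below builds one query of each kind as plain strings. It is only an illustration under stated assumptions: the helper names and the example endpoint URL are hypothetical, and nothing here reflects WimuQ's actual interface, which the abstract does not describe. The only standard feature relied on is the `SERVICE` keyword of SPARQL 1.1 Federated Query, which delegates part of a query to a remote endpoint.

```python
from textwrap import dedent

# Well-known vocabulary IRIs used in the example patterns.
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWL_SAMEAS = "http://www.w3.org/2002/07/owl#sameAs"


def non_federated_query(limit: int = 10) -> str:
    """Hypothetical helper: a query answered entirely by one source."""
    return dedent(f"""\
        SELECT ?s ?label WHERE {{
          ?s <{RDFS_LABEL}> ?label .
        }} LIMIT {limit}""")


def federated_query(endpoint: str, limit: int = 10) -> str:
    """Hypothetical helper: part of the pattern is sent to a remote
    endpoint through the SPARQL 1.1 SERVICE keyword."""
    return dedent(f"""\
        SELECT ?s ?same ?label WHERE {{
          ?s <{OWL_SAMEAS}> ?same .
          SERVICE <{endpoint}> {{
            ?same <{RDFS_LABEL}> ?label .
          }}
        }} LIMIT {limit}""")


if __name__ == "__main__":
    # The endpoint URL is a placeholder, not one of the 559 endpoints
    # mentioned in the abstract.
    print(non_federated_query())
    print(federated_query("https://example.org/sparql"))
```

In the federated form, the local source supplies the `owl:sameAs` links and the remote endpoint supplies the labels, so the result combines facts that neither source holds alone.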