K. Schlegel, F. Stegmaier, Sebastian P. Bayerl, M. Granitzer, H. Kosch
Balloon Fusion: SPARQL rewriting based on unified co-reference information
2014 IEEE 30th International Conference on Data Engineering Workshops, published 2014-05-19
DOI: 10.1109/ICDEW.2014.6818335 (https://doi.org/10.1109/ICDEW.2014.6818335)
Citations: 19
Abstract
Although Linked Open Data has grown enormously in volume, there is still no single point of access for querying the more than 200 SPARQL repositories. In this paper we present Balloon Fusion, a SPARQL 1.1 rewriting and query federation service built on crawling and consolidating co-reference relationships in over 100 reachable Linked Data SPARQL endpoints. This process yielded 17.6M co-reference statements, which have been clustered into 8.4M distinct semantic entities and are now available for download for further analysis. The proposed SPARQL rewriting substitutes all URI occurrences with their synonyms and combines this with automatic endpoint selection based on URI origin to achieve comprehensive query federation. While we demonstrate technical feasibility, we also critically reflect on the current status of the Linked Open Data cloud: although it is huge in size, access via SPARQL endpoints is complicated in most cases by a lack of quality of service.
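The two steps named in the abstract, clustering co-reference statements into entities and rewriting a query over the resulting synonyms with endpoint selection by URI origin, can be illustrated with a small sketch. This is not the authors' implementation: the union-find clustering, the function names (`cluster_coreferences`, `select_endpoint`, `rewrite_pattern`), the `example.org` URIs, and the host-to-endpoint table are all assumptions made for illustration, and the UNION-of-SERVICE output is only one plausible shape for a SPARQL 1.1 federated rewrite.

```python
from collections import defaultdict
from urllib.parse import urlparse


def cluster_coreferences(pairs):
    """Group URIs linked by co-reference pairs (e.g. owl:sameAs) using union-find.

    Returns a dict mapping every URI to the set of all URIs in its cluster.
    """
    parent = {}

    def find(u):
        parent.setdefault(u, u)
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    clusters = defaultdict(set)
    for u in list(parent):
        clusters[find(u)].add(u)
    return {u: members for members in clusters.values() for u in members}


def select_endpoint(uri, endpoints_by_host):
    """Pick an endpoint from the URI's host; None if the host is unknown."""
    return endpoints_by_host.get(urlparse(uri).netloc)


def rewrite_pattern(subject_uri, predicate_iri, obj_var, synonyms, endpoints_by_host):
    """Expand one triple pattern into a UNION of SERVICE blocks, one per synonym."""
    blocks = []
    for uri in sorted(synonyms.get(subject_uri, {subject_uri})):
        endpoint = select_endpoint(uri, endpoints_by_host)
        if endpoint is None:
            continue  # no known endpoint for this URI origin, so skip it
        blocks.append(
            f"{{ SERVICE <{endpoint}> {{ <{uri}> <{predicate_iri}> ?{obj_var} }} }}"
        )
    body = "\n  UNION\n  ".join(blocks)
    return f"SELECT ?{obj_var} WHERE {{\n  {body}\n}}"


if __name__ == "__main__":
    # Hypothetical co-reference statements and endpoint table.
    pairs = [
        ("http://a.example.org/resource/Berlin", "http://b.example.org/id/Berlin"),
        ("http://b.example.org/id/Berlin", "http://c.example.org/Berlin"),
    ]
    endpoints = {
        "a.example.org": "http://a.example.org/sparql",
        "b.example.org": "http://b.example.org/sparql",
    }
    synonyms = cluster_coreferences(pairs)
    print(rewrite_pattern(
        "http://a.example.org/resource/Berlin",
        "http://www.w3.org/2000/01/rdf-schema#label",
        "label",
        synonyms,
        endpoints,
    ))
```

In this sketch a synonym whose host has no known endpoint is silently dropped from the rewritten query; a real service would also need to handle unreachable or slow endpoints, which is the quality-of-service issue the abstract raises.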