Semantic search on summarized RDF triples

2017 International Conference on Intelligent Computing and Control (I2C2) | Pub Date: 2017-06-01 | DOI: 10.1109/I2C2.2017.8321904

P. Gayathri, V. Rajendran


Citations: 5

Abstract

Most information resides in databases of some kind, and integrating that information benefits the organizations that own it. RDF is a general-purpose language for the Web that unifies data from different sources, and SPARQL, the query language for RDF, can join data from multiple databases. Querying a very large RDF dataset, however, is extremely time consuming. Submitting a SPARQL query only to the most promising subgraph increases speed and reduces the search space, and RDF summarization is performed for this purpose. Existing systems either apply graph-based techniques to summarize RDF or simply divide RDF triples by their elements. Although RDF has a graph-like structure, it cannot be expected to have every feature of a graph, and simple partitioning by triple elements is also inefficient. In this paper, the RDF dataset is first partitioned based on predicate similarity. The partitions are then clustered by the semantic relatedness between predicates, so that similar triples fall into the same cluster. The resulting RDF cluster graphs are stored as named graphs in Jena Tuple DataBase (TDB). SPARQL queries are run against this collection of named graphs: the list of models a query requires is obtained from an index, and the query is executed on the union of those graphs. The proposed algorithm is faster because the search space is reduced, and it is also scalable.
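The pipeline described in the abstract (partition by predicate, cluster related predicates, store each cluster as a named graph, then use an index to query only the relevant graphs) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The abstract does not specify the relatedness measure, so the token-overlap (Jaccard) score below is an assumed stand-in, and the function names, the `graphN` naming scheme, the greedy clustering, and the 0.3 threshold are all hypothetical. In-memory dictionaries stand in for Jena TDB, and a single triple pattern with `None` wildcards stands in for a SPARQL query.

```python
import re
from collections import defaultdict


def partition_by_predicate(triples):
    """Group (subject, predicate, object) triples by their predicate."""
    parts = defaultdict(list)
    for s, p, o in triples:
        parts[p].append((s, p, o))
    return dict(parts)


def tokens(predicate):
    """Split a predicate's local name into lowercase word tokens."""
    local = re.split(r"[/#:]", predicate)[-1]
    return {t.lower() for t in re.findall(r"[A-Za-z][a-z]*", local)}


def relatedness(p, q):
    """ASSUMPTION: Jaccard overlap of name tokens as a relatedness proxy."""
    a, b = tokens(p), tokens(q)
    return len(a & b) / len(a | b) if a | b else 0.0


def cluster_predicates(predicates, threshold=0.3):
    """Greedy clustering: join the first cluster holding a related predicate."""
    clusters = []
    for p in sorted(predicates):
        for cluster in clusters:
            if any(relatedness(p, q) >= threshold for q in cluster):
                cluster.append(p)
                break
        else:
            clusters.append([p])
    return clusters


def build_store(triples, threshold=0.3):
    """Return (named graphs, predicate -> graph name index)."""
    parts = partition_by_predicate(triples)
    graphs, index = {}, {}
    for i, cluster in enumerate(cluster_predicates(parts, threshold)):
        name = f"graph{i}"  # hypothetical naming scheme
        graphs[name] = [t for p in cluster for t in parts[p]]
        for p in cluster:
            index[p] = name
    return graphs, index


def query(graphs, index, pattern):
    """Match one triple pattern (None = wildcard) on the needed graphs only."""
    s, p, o = pattern
    # With a bound predicate the index picks the graph; otherwise scan all.
    needed = [index[p]] if p in index else ([] if p else sorted(graphs))
    union = [t for name in needed for t in graphs[name]]
    hits = [t for t in union
            if all(want is None or want == got
                   for want, got in zip((s, p, o), t))]
    return needed, hits


if __name__ == "__main__":
    data = [
        ("ex:alice", "ex:birthPlace", "ex:Paris"),
        ("ex:alice", "ex:birthDate", "1990-01-01"),
        ("ex:alice", "ex:authorOf", "ex:book1"),
        ("ex:bob", "ex:birthPlace", "ex:Rome"),
    ]
    graphs, index = build_store(data)
    print(query(graphs, index, (None, "ex:birthPlace", None)))
```

In this toy run the two `birth*` predicates land in one named graph and `ex:authorOf` in another, so a pattern on `ex:birthPlace` scans three triples instead of four. On a real dataset the saving would depend on how well the chosen relatedness measure groups the predicates that queries use together.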


