Post-analysis of Keyword-Based Search Results Using Entity Mining, Linked Data, and Link Analysis at Query Time

2014 IEEE International Conference on Semantic Computing Pub Date : 2014-06-16 DOI:10.1109/ICSC.2014.11

P. Fafalios, Yannis Tzitzikas


Citations: 18

Abstract

The integration of the classical Web (of documents) with the emerging Web of Data is a challenging vision. In this paper we focus on an integration approach at search time that enriches the responses of non-semantic search systems (e.g. professional search systems, web search engines) with semantic information, i.e. Linked Open Data (LOD), and exploits the outcome to provide an overview of the search space and to let users not only restrict that space but also explore the related LOD. We use named entities (e.g. persons, locations) as the "glue" for automatically connecting search hits with LOD. We consider a scenario where this entity-based integration is performed at query time with no human effort and no a-priori indexing, which is beneficial in terms of configurability and freshness. Realizing this scenario raises various challenges. One thorny issue is that the number of identified entities can be high, and the same is true of the semantic information about these entities that can be fetched from the available LOD (i.e. their properties and associations with other entities). To address this, we propose a Link Analysis-based method which is used for (a) ranking (and thus selecting to show) the more important semantic information related to the search results, and (b) deriving and showing top-K semantic graphs. We then report promising results from a survey in the marine domain, together with comparative results that illustrate the effectiveness of the proposed (PageRank-based) ranking scheme. Finally, we report experimental results on efficiency, showing that the proposed functionality can be offered even at query time.
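
The abstract does not spell out the ranking scheme beyond calling it PageRank-based, so the following is only a minimal sketch of the general idea, not the authors' method: plain PageRank by power iteration over a small directed graph whose nodes are search hits and entities, followed by a top-K selection. The function names (`pagerank`, `top_k`), the damping factor of 0.85, the iteration count, and the example edges are all assumptions made for illustration.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Plain PageRank by power iteration (illustrative, not the paper's exact scheme).

    edges: iterable of (source, target) pairs, e.g. hit -> entity or
    entity -> related entity. Returns a dict mapping node -> score.
    """
    nodes = sorted({node for edge in edges for node in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    scores = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Score held by nodes without outgoing links is spread uniformly.
        dangling = sum(scores[node] for node in nodes if not out_links[node])
        new_scores = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for src in nodes:
            targets = out_links[src]
            if targets:
                share = damping * scores[src] / len(targets)
                for dst in targets:
                    new_scores[dst] += share
        scores = new_scores
    return scores


def top_k(scores, k):
    """Return the k highest-scoring nodes (ties broken alphabetically)."""
    return sorted(scores, key=lambda node: (-scores[node], node))[:k]


# Hypothetical example: two search hits mention entities, and the entities
# are linked to each other through (made-up) LOD associations.
example_edges = [
    ("hit:1", "Thunnus albacares"),
    ("hit:2", "Thunnus albacares"),
    ("hit:2", "Indian Ocean"),
    ("Thunnus albacares", "Scombridae"),
    ("Scombridae", "Thunnus albacares"),
]
example_scores = pagerank(example_edges)
most_important = top_k(example_scores, 2)
```

Because the scores are computed only over the entities found in the current result set, a graph of this kind can be built and ranked at query time without a precomputed index, which is the property the abstract emphasizes.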
