Discovery and Creation of Rich Entities for Knowledge Bases

A. Quamar, Fatma Özcan, Konstantinos Xirogiannopoulos
{"title":"Discovery and Creation of Rich Entities for Knowledge Bases","authors":"A. Quamar, Fatma Özcan, Konstantinos Xirogiannopoulos","doi":"10.1145/3214708.3214712","DOIUrl":null,"url":null,"abstract":"Businesses and professional organizations from a variety of different domains such as finance, weather, healthcare, social networks, etc., produce massive amounts of unstructured, semi-structured and structured data. Knowledge bases, enable querying and analysis of integrated content derived from such data available as open, third party and propriety data sets. Many knowledge bases today, provide an entity-centric view over the integrated content by using domain-specific ontologies. These entity-centric views enable querying individual real-world entities, as well as exploring exact information (such as address or net revenue of a company) through explicit querying using languages such as SQL or SPARQL. Although very useful for many business and commercial applications, this may not be sufficient for the exploration of relevant and context specific information associated with real-world entities stored in these knowledge bases. Users often need to resort to a manual and tedious process of exploration using ad-hoc queries to gather the required information. To enhance user experience and ameliorate the problem of relevant data exploration, we propose the concept of Rich Entities. These rich entities comprise of all the relevant and context specific information grouped together around real-world entities and served as efficient and meaningful responses to user queries against these entities in a knowledge base. These rich entities are created by grouping together information not only from a single entity represented as an ontology concept, but also related concepts and properties as specified by the domain ontology. In this paper we propose several novel techniques and algorithms to automatically detect, learn, and create domain-specific rich entities. We use inputs from query patterns in existing query workloads against knowledge bases, and leverage the structure and relationships between entities defined in the domain ontology. Our techniques are very effective and can be applied to a wide variety of application domains thus adding great value to data exploration and information extraction from entity-centric real-world knowledge bases.","PeriodicalId":93360,"journal":{"name":"Proceedings of the 5th International Workshop on Exploratory Search in Databases and the Web. International Workshop on Exploratory Search in Databases and the Web (5th : 2018 : Houston, Tex.)","volume":"69 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2018-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 5th International Workshop on Exploratory Search in Databases and the Web. International Workshop on Exploratory Search in Databases and the Web (5th : 2018 : Houston, Tex.)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3214708.3214712","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

Businesses and professional organizations from a variety of different domains such as finance, weather, healthcare, social networks, etc., produce massive amounts of unstructured, semi-structured and structured data. Knowledge bases, enable querying and analysis of integrated content derived from such data available as open, third party and propriety data sets. Many knowledge bases today, provide an entity-centric view over the integrated content by using domain-specific ontologies. These entity-centric views enable querying individual real-world entities, as well as exploring exact information (such as address or net revenue of a company) through explicit querying using languages such as SQL or SPARQL. Although very useful for many business and commercial applications, this may not be sufficient for the exploration of relevant and context specific information associated with real-world entities stored in these knowledge bases. Users often need to resort to a manual and tedious process of exploration using ad-hoc queries to gather the required information. To enhance user experience and ameliorate the problem of relevant data exploration, we propose the concept of Rich Entities. These rich entities comprise of all the relevant and context specific information grouped together around real-world entities and served as efficient and meaningful responses to user queries against these entities in a knowledge base. These rich entities are created by grouping together information not only from a single entity represented as an ontology concept, but also related concepts and properties as specified by the domain ontology. In this paper we propose several novel techniques and algorithms to automatically detect, learn, and create domain-specific rich entities. We use inputs from query patterns in existing query workloads against knowledge bases, and leverage the structure and relationships between entities defined in the domain ontology. Our techniques are very effective and can be applied to a wide variety of application domains thus adding great value to data exploration and information extraction from entity-centric real-world knowledge bases.
知识库丰富实体的发现与创建
来自不同领域的企业和专业组织,如金融、天气、医疗保健、社交网络等,会产生大量的非结构化、半结构化和结构化数据。知识库,允许查询和分析从这些数据中获得的集成内容,这些数据可以作为开放的、第三方的和专有的数据集。如今,许多知识库通过使用特定于领域的本体,在集成的内容上提供以实体为中心的视图。这些以实体为中心的视图支持查询真实世界中的单个实体,以及通过使用SQL或SPARQL等语言进行显式查询来探索精确信息(例如公司的地址或净收入)。尽管对于许多业务和商业应用程序非常有用,但对于探索与存储在这些知识库中的现实世界实体相关的相关信息和特定于上下文的信息来说,这可能还不够。用户通常需要借助手动和繁琐的探索过程,使用特别的查询来收集所需的信息。为了增强用户体验和改善相关数据探索问题,我们提出了富实体的概念。这些丰富的实体包含围绕现实世界实体组合在一起的所有相关和特定于上下文的信息,并作为知识库中针对这些实体的用户查询的有效和有意义的响应。这些丰富的实体是通过将信息分组在一起创建的,这些信息不仅来自表示为本体概念的单个实体,还来自领域本体指定的相关概念和属性。在本文中,我们提出了几种新的技术和算法来自动检测、学习和创建特定于领域的富实体。我们对知识库使用现有查询工作负载中的查询模式输入,并利用领域本体中定义的实体之间的结构和关系。我们的技术非常有效,可以应用于各种各样的应用领域,从而为以实体为中心的现实世界知识库的数据探索和信息提取增加了巨大的价值。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信