From information to knowledge: harvesting entities and relationships from web sources

Proceedings of the ... ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems

Pub Date: 2010-06-06

DOI: 10.1145/1807085.1807097

G. Weikum, M. Theobald

Citations: 160

Abstract

There are major trends to advance the functionality of search engines to a more expressive semantic level. This is enabled by the advent of knowledge-sharing communities such as Wikipedia and the progress in automatically extracting entities and relationships from semistructured as well as natural-language Web sources. Recent endeavors of this kind include DBpedia, EntityCube, KnowItAll, ReadTheWeb, and our own YAGO-NAGA project (and others). The goal is to automatically construct and maintain a comprehensive knowledge base of facts about named entities, their semantic classes, and their mutual relations as well as temporal contexts, with high precision and high recall. This tutorial discusses state-of-the-art methods, research opportunities, and open challenges along this avenue of knowledge harvesting.

Source journal

Proceedings of the ... ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems

CiteScore

4.40

Self-citation rate

0.00%

Articles published