Fact Discovery in Wikipedia

IEEE/WIC/ACM International Conference on Web Intelligence (WI'07)
Pub Date: 2007-11-01
DOI: 10.1109/WI.2007.57

S. F. Adafre, V. Jijkoun, M. de Rijke

Citations: 16

Abstract

We address the task of extracting focused salient information items, relevant and important for a given topic, from a large encyclopedic resource. Specifically, for a given topic (a Wikipedia article) we identify snippets from other articles in Wikipedia that contain important information for the topic of the original article, without duplicates. We compare several methods for addressing the task, and find that a mixture of content-based, link-based, and layout-based features outperforms other methods, especially in combination with the use of so-called reference corpora that capture the key properties of entities of a common type.
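
The abstract names three feature families (content, link, layout) and a no-duplicates requirement but gives no formulas, so the following is only a minimal sketch of how such a mixture could rank candidate snippets. Everything in it is an assumption made for illustration: the `Snippet` fields, the Jaccard overlap used as the content feature, the fixed weights, and the duplicate threshold are hypothetical and are not taken from the paper. Reference corpora are not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    text: str
    links_to_topic: bool     # hypothetical link-based signal
    in_lead_paragraph: bool  # hypothetical layout-based signal


def tokens(text):
    """Lowercased whitespace tokens as a set (a deliberately crude tokenizer)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard overlap of two token sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score(snippet, topic_terms, weights=(0.5, 0.3, 0.2)):
    """Weighted mix of content, link, and layout signals (weights are assumed)."""
    w_content, w_link, w_layout = weights
    content = jaccard(tokens(snippet.text), topic_terms)
    return (w_content * content
            + w_link * float(snippet.links_to_topic)
            + w_layout * float(snippet.in_lead_paragraph))


def select(snippets, topic_terms, k=3, dup_threshold=0.8):
    """Take the top-k snippets by score, skipping near-duplicates of earlier picks."""
    ranked = sorted(snippets, key=lambda s: score(s, topic_terms), reverse=True)
    chosen = []
    for s in ranked:
        if all(jaccard(tokens(s.text), tokens(c.text)) < dup_threshold
               for c in chosen):
            chosen.append(s)
        if len(chosen) == k:
            break
    return chosen
```

In this sketch a snippet that repeats an already selected one word for word is dropped even if it scores well, which is one simple way to read the "without duplicates" requirement.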
