Extracting Domain-Relevant Term Using Wikipedia Based on Random Walk Model

2012 Seventh ChinaGrid Annual Conference. Pub Date: 2012-09-20. DOI: 10.1109/CHINAGRID.2012.20

Wenjuan Wu, Tao Liu, H. Hu, Xiaoyong Du

Citations: 5

Abstract

In this paper we present a new approach for automatically identifying the concepts and entities relevant to a given domain, using the category and page structures of Wikipedia in a language-independent way. By applying a Markov random walk algorithm to the weighted Wikipedia link graph, our approach can identify large numbers of domain-relevant concepts and entities with very little human effort. Experimental results show that our method achieves high accuracy and acceptable efficiency in domain-relevant term extraction.
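
The abstract does not give the transition weights, the seed set, or the stopping rule, so the following is only a minimal sketch of one common variant, a random walk with restart over a weighted link graph, and not the authors' actual formulation. The toy graph, the edge weights, the seed page, and the `restart` and `iterations` parameters are all illustrative assumptions.

```python
def random_walk_scores(graph, seeds, restart=0.15, iterations=50):
    """Score nodes by a random walk with restart on a weighted directed graph.

    graph: dict mapping node -> {neighbor: positive edge weight}
    seeds: nodes assumed (hypothetically) to be known domain-relevant pages
    """
    seeds = set(seeds)
    nodes = set(graph)
    for neighbors in graph.values():
        nodes.update(neighbors)

    # The walker starts on, and restarts to, the seed pages uniformly.
    seed_mass = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(seed_mass)

    for _ in range(iterations):
        nxt = {n: restart * seed_mass[n] for n in nodes}
        for node in nodes:
            out_edges = graph.get(node, {})
            total = sum(out_edges.values())
            if total == 0:
                # Dangling node: send its mass back to the seeds.
                for s in seeds:
                    nxt[s] += (1 - restart) * scores[node] / len(seeds)
                continue
            # Follow each out-link with probability proportional to its weight.
            for neighbor, weight in out_edges.items():
                nxt[neighbor] += (1 - restart) * scores[node] * weight / total
        scores = nxt
    return scores


# Toy link graph (assumed data, not taken from Wikipedia or the paper).
graph = {
    "Database": {"SQL": 2.0, "Relational model": 1.0},
    "SQL": {"Database": 1.0, "Relational model": 1.0},
    "Relational model": {"Database": 1.0},
    "Football": {"Stadium": 1.0},
    "Stadium": {"Football": 1.0},
}
scores = random_walk_scores(graph, seeds={"Database"})

# Rank pages by score; higher means more reachable from the seed page.
for page, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.4f}")
```

In this sketch, pages that the seed cannot reach through links keep a score of zero, so a threshold on the score would separate candidate domain terms from unrelated pages. How the paper actually weights links and selects terms is not stated in the abstract.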
