Text mining Wikipedia to discover alternative destinations

The 2013 10th International Joint Conference on Computer Science and Software Engineering (JCSSE). Pub Date: 2013-05-29. DOI: 10.1109/JCSSE.2013.6567317

K. Cosh

Citations: 2

Abstract

This paper applies statistical Natural Language Processing algorithms to a set of Wikipedia articles about top tourist destinations. The objective is to automatically identify the key features of each destination and then discover other destinations that share similar sets of features. In doing so, it demonstrates a method by which metadata about each article can be extracted from the unstructured text and then used to answer complex discovery-type queries. The paper compares automatic clustering of similar destinations with a more user-driven, feature-focused technique.
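The abstract does not name the specific algorithms used, so the following is only an illustrative sketch of the general idea, assuming TF-IDF term weighting and cosine similarity as stand-ins. The article texts, destination names, and all function names are hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter

# Toy stand-ins for Wikipedia article text (hypothetical, not the paper's data).
ARTICLES = {
    "Paris": "museum art cathedral river cafe museum fashion",
    "Florence": "museum art cathedral renaissance river art",
    "Phuket": "beach island diving resort beach nightlife",
    "Bali": "beach island temple surfing resort rice",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return {name: {term: tf-idf weight}} for a dict of raw texts."""
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    n_docs = len(docs)
    # Document frequency: in how many articles each term appears.
    doc_freq = Counter(term for c in counts.values() for term in c)
    vectors = {}
    for name, c in counts.items():
        total = sum(c.values())
        vectors[name] = {
            term: (freq / total) * math.log(n_docs / doc_freq[term])
            for term, freq in c.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def key_features(vectors, name, k=3):
    """Top-k weighted terms, used here as the 'key features' of a destination."""
    ranked = sorted(vectors[name].items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def similar_destinations(vectors, name):
    """Automatic route: rank all other destinations by similarity to `name`."""
    scores = [
        (other, cosine(vectors[name], vec))
        for other, vec in vectors.items()
        if other != name
    ]
    return sorted(scores, key=lambda kv: -kv[1])


def destinations_with_features(vectors, wanted):
    """User-driven route: rank destinations by summed weight of requested terms."""
    scores = {
        name: sum(vec.get(t, 0.0) for t in wanted)
        for name, vec in vectors.items()
    }
    ranked = sorted(scores.items(), key=lambda kv: -kv[1])
    return [name for name, score in ranked if score > 0]


if __name__ == "__main__":
    vectors = tfidf_vectors(ARTICLES)
    print(key_features(vectors, "Paris"))
    print(similar_destinations(vectors, "Paris"))
    print(destinations_with_features(vectors, ["beach", "temple"]))
```

Here `similar_destinations` loosely mirrors the automatic, similarity-based route, while `destinations_with_features` mirrors a user-driven query over extracted features. A real pipeline would need stop-word removal and far larger texts than these toy strings.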
