The retrieval research of non-adjacent keywords in Chinese corpus — A case study of “Yi…Jiu…” construction

2014 International Conference on Asian Language Processing (IALP). Pub Date: 2014-12-04. DOI: 10.1109/IALP.2014.6973507

Xiao-ru Tan, Lijiao Yang

Citations: 0

Abstract

Corpus concordancing is a popular research topic, and retrieving data from a corpus by supplying non-adjacent keywords is a widely used function. However, the precision of the retrieval results is not very high, because the machine cannot recognize the relationship between the non-adjacent keywords. To address this problem, this paper proposes a rule-based method for the “Yi...Jiu...” construction that excludes unrelated data even when the data contain both keywords. Experiments show that the precision is close to 82%.
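The abstract does not list the rules themselves, so the following is only a minimal sketch of the general idea: a naive non-adjacent keyword search followed by a rule-based filter. It assumes that “Yi...Jiu...” denotes the 一…就… construction. The word lists `YI_WORDS` and `JIU_WORDS`, the `max_gap` parameter, and the function names are hypothetical illustrations, not the rules used in the paper.

```python
import re

# Naive non-adjacent retrieval: any "一" followed later by "就".
NAIVE = re.compile(r"一.*?就")

# Hypothetical exclusion lists (illustrative only, NOT the paper's rules):
# common words in which "一" or "就" is just one character of a longer word.
YI_WORDS = ("一个", "一些", "一定", "一起", "一样", "统一", "唯一")
JIU_WORDS = ("成就", "就业", "就是", "就算", "就职")


def naive_hits(sentences):
    """Return every sentence that contains both keywords in order."""
    return [s for s in sentences if NAIVE.search(s)]


def part_of_word(sentence, pos, char, words):
    """True if the character at `pos` belongs to one of the listed words."""
    for w in words:
        start = pos - w.index(char)  # where the word would have to begin
        if start >= 0 and sentence[start:start + len(w)] == w:
            return True
    return False


def is_construction(sentence, max_gap=8):
    """Keep a sentence only if a free-standing 一 ... 就 pair remains.

    `max_gap` (assumed value) limits how far apart the two keywords may be.
    """
    yi = [i for i, c in enumerate(sentence)
          if c == "一" and not part_of_word(sentence, i, "一", YI_WORDS)]
    jiu = [i for i, c in enumerate(sentence)
           if c == "就" and not part_of_word(sentence, i, "就", JIU_WORDS)]
    # Require at least one character between the two keywords.
    return any(1 < j - i <= max_gap for i in yi for j in jiu)


SAMPLE = [
    "他一回家就睡觉。",          # genuine 一…就… construction
    "一个人的成就离不开努力。",  # both keywords present, but unrelated
    "我们一起去公园。",          # only one keyword
]

naive = naive_hits(SAMPLE)
filtered = [s for s in SAMPLE if is_construction(s)]
```

In this toy sample the naive search returns the first two sentences, while the filter keeps only the first. A realistic rule set would most likely rely on word segmentation and part-of-speech information rather than fixed word lists.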

