A Chinese Efficient Analyser Integrating Word Segmentation, Part-Of-Speech Tagging, Partial Parsing and Full Parsing

Workshop on Chinese Language Processing. Pub Date: 2003-07-11. DOI: 10.3115/1119250.1119261

Guodong Zhou, Jian Su

Citations: 9

Abstract

This paper introduces an efficient analyser for Chinese that integrates word segmentation, part-of-speech tagging, partial parsing and full parsing. The analyser is based on a Hidden Markov Model (HMM) and an HMM-based tagger: all the components run on the same HMM-based tagging engine. Using a single engine has three advantages. First, it greatly reduces code size and makes maintenance easy. Second, the code is easy to optimise for speed, which is critically important in many applications. Third, every component benefits whenever an existing algorithm in the engine is optimised or replaced by a better one. Experiments show that all the components achieve state-of-the-art performance on Chinese with high efficiency.
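
The single-engine idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the model details, tag sets or training data, so the code below assumes a plain first-order HMM decoded with the Viterbi algorithm, a hypothetical two-tag segmentation scheme (B = character begins a word, I = character continues a word), and hand-set toy probabilities. The names `viterbi` and `tags_to_words` are illustrative. The point is only that one decoding routine can serve both word segmentation and part-of-speech tagging when each task is cast as sequence tagging.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p, floor=1e-6):
    """Most likely state sequence under a first-order HMM (log-space Viterbi).

    Probabilities missing from a table fall back to `floor`, a crude
    smoothing choice made for this sketch only.
    """
    if not observations:
        return []

    def lp(table, key):
        return math.log(table.get(key, floor))

    # best[t][s] = (log-prob of the best path ending in s at time t, previous state)
    best = [{s: (lp(start_p, s) + lp(emit_p[s], observations[0]), None)
             for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            score, prev = max((best[-1][p][0] + lp(trans_p[p], s), p)
                              for p in states)
            column[s] = (score + lp(emit_p[s], obs), prev)
        best.append(column)

    # Backtrack from the best final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return path[::-1]


def tags_to_words(chars, tags):
    """Group characters into words using B/I tags."""
    words = []
    for ch, tag in zip(chars, tags):
        if tag == "B" or not words:
            words.append(ch)
        else:
            words[-1] += ch
    return words


# Task 1: word segmentation as character tagging (toy, hand-set probabilities).
seg_states = ["B", "I"]
seg_start = {"B": 1.0}
seg_trans = {"B": {"B": 0.5, "I": 0.5}, "I": {"B": 0.8, "I": 0.2}}
seg_emit = {
    "B": {"我": 0.3, "爱": 0.3, "北": 0.3, "京": 0.1},
    "I": {"京": 0.9, "北": 0.05, "爱": 0.05},
}
chars = list("我爱北京")
seg_tags = viterbi(chars, seg_states, seg_start, seg_trans, seg_emit)
words = tags_to_words(chars, seg_tags)

# Task 2: part-of-speech tagging with the same engine, over the words from task 1.
pos_states = ["PRON", "VERB", "NOUN"]
pos_start = {"PRON": 0.6, "NOUN": 0.3, "VERB": 0.1}
pos_trans = {
    "PRON": {"VERB": 0.8, "NOUN": 0.1, "PRON": 0.1},
    "VERB": {"NOUN": 0.7, "PRON": 0.2, "VERB": 0.1},
    "NOUN": {"VERB": 0.5, "NOUN": 0.4, "PRON": 0.1},
}
pos_emit = {
    "PRON": {"我": 1.0},
    "VERB": {"爱": 1.0},
    "NOUN": {"北京": 0.8, "爱": 0.2},
}
pos_tags = viterbi(words, pos_states, pos_start, pos_trans, pos_emit)

print(words)     # ['我', '爱', '北京']
print(pos_tags)  # ['PRON', 'VERB', 'NOUN']
```

Only the state set and probability tables change between the two tasks; the decoder is shared, so any speed-up to `viterbi` would apply to both. Partial and full parsing could in principle be cast as tagging in a similar way, but how the paper does so is not stated in the abstract.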
