Co-reference Resolution in Vietnamese Documents Based on Support Vector Machines

2011 International Conference on Asian Language Processing. Pub Date: 2011-11-15. DOI: 10.1109/IALP.2011.63

Duc-Trong Le, Mai-Vu Tran, Tri-Thanh Nguyen, Quang-Thuy Ha

Citations: 2

Abstract

Co-reference resolution in Vietnamese still poses many challenges, owing to the complexity of the language and the lack of standard Vietnamese linguistic resources. Building on the mention-pair model of Rahman and Ng (2009) and on the characteristics of Vietnamese, this paper proposes a model that uses support vector machines (SVM) to resolve co-reference in Vietnamese documents. The corpus used to evaluate the proposed model was constructed from 200 articles in the cultural and social categories of the vnexpress.net newspaper website. In initial experiments, the proposed model achieved 76.51% accuracy, compared with 73.79% for a baseline model using similar features.
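To make the mention-pair idea concrete, the following is a minimal sketch of how pairwise training instances might be built before they are passed to an SVM. It is an illustration under stated assumptions, not the authors' implementation: the `Mention` class, the three features, and the all-pairs instance generation are hypothetical simplifications, and the abstract does not list the actual feature set or sampling scheme.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Mention:
    text: str       # surface string of the mention
    sentence: int   # index of the sentence containing the mention
    entity: int     # gold entity id (hypothetical annotation scheme)


def pair_features(antecedent, anaphor):
    """Illustrative features only; the paper's feature set is not given in the abstract."""
    a = antecedent.text.lower()
    b = anaphor.text.lower()
    return [
        1.0 if a == b else 0.0,                         # exact string match
        1.0 if a in b or b in a else 0.0,               # one string contains the other
        float(anaphor.sentence - antecedent.sentence),  # distance in sentences
    ]


def make_instances(mentions):
    """Build one instance per ordered mention pair (earlier, later).

    Assumption: every pair is used. Published mention-pair systems
    usually sample pairs more selectively.
    """
    X, y = [], []
    for antecedent, anaphor in combinations(mentions, 2):
        X.append(pair_features(antecedent, anaphor))
        # +1 if both mentions refer to the same entity, -1 otherwise
        y.append(1 if antecedent.entity == anaphor.entity else -1)
    return X, y


mentions = [
    Mention("Nguyễn Văn A", 0, 1),
    Mention("Hà Nội", 0, 2),
    Mention("ông Nguyễn Văn A", 2, 1),
]
X, y = make_instances(mentions)
for features, label in zip(X, y):
    print(features, label)
```

The resulting `X` and `y` could then be handed to any binary SVM implementation. At prediction time, pairs classified as co-referent would be merged into entity clusters.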
