Building a Vietnamese Dependency Treebank Based on Chinese-Vietnamese Bilingual Word Alignment

Ying Li, Jianyi Guo, Zhengtao Yu, Hongbin Wang, Yonghua Wen
2016 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), published 2016-08-01. DOI: 10.1109/FSKD.2016.7603371 (https://doi.org/10.1109/FSKD.2016.7603371)
Citations: 0

Abstract

Treebanks are an important resource in natural language processing. Compared with Chinese, which has rich and mature corpora, syntactic analysis of Vietnamese is much more difficult. This paper presents a new approach that uses a Chinese-Vietnamese bilingual word-aligned corpus to build a Vietnamese dependency treebank. First, word alignment is performed on sentence-aligned Chinese-Vietnamese text. Second, the Chinese sentences are dependency-parsed. Finally, the Vietnamese dependency treebank is generated from the Chinese-Vietnamese alignment relations and the Chinese dependency trees. In addition, converting Vietnamese phrase-structure trees into dependency trees can significantly improve the accuracy of dependency analysis. Experimental results show that the approach simplifies the manual collection and annotation of a Vietnamese treebank and saves the manpower and time needed to build one, and that its accuracy is significantly higher than that of machine learning methods.
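The final step of the pipeline, projecting a Chinese dependency tree onto the aligned Vietnamese sentence, can be illustrated with a minimal sketch. The abstract does not give the projection rules, so this assumes the simplest case: one-to-one word alignments, with an arc copied only when both its head and its dependent are aligned. The function name `project_dependencies`, the example sentence pair, and the labels (`SBV`, `HED`, `VOB`, `ATT`) are illustrative assumptions, not taken from the paper.

```python
from typing import Dict, List, Tuple

# An arc is (head_index, dependent_index, label).
# Word indices are 1-based; index 0 is the artificial ROOT node.
Arc = Tuple[int, int, str]


def project_dependencies(source_arcs: List[Arc],
                         alignment: Dict[int, int]) -> List[Arc]:
    """Copy source-side arcs to the target side through a word alignment.

    Assumes a one-to-one alignment (source index -> target index).
    Arcs with an unaligned head or dependent are skipped, so the
    result may be a partial tree that still needs manual completion.
    """
    projected = []
    for head, dep, label in source_arcs:
        tgt_dep = alignment.get(dep)
        # ROOT always maps to ROOT; other heads go through the alignment.
        tgt_head = 0 if head == 0 else alignment.get(head)
        if tgt_dep is None or tgt_head is None:
            continue
        projected.append((tgt_head, tgt_dep, label))
    # Order arcs by the position of the dependent in the target sentence.
    return sorted(projected, key=lambda arc: arc[1])


# Hypothetical example pair (not from the paper):
#   Chinese:    我(1) 喜欢(2) 红(3) 花(4)      "I like red flowers"
#   Vietnamese: Tôi(1) thích(2) hoa(3) đỏ(4)
chinese_arcs = [(2, 1, "SBV"), (0, 2, "HED"), (4, 3, "ATT"), (2, 4, "VOB")]
zh_to_vi = {1: 1, 2: 2, 3: 4, 4: 3}  # 红 -> đỏ, 花 -> hoa (word order differs)

vietnamese_arcs = project_dependencies(chinese_arcs, zh_to_vi)
for head, dep, label in vietnamese_arcs:
    print(head, dep, label)
```

Here the adjective follows the noun in Vietnamese, so the `ATT` arc from 花 to 红 becomes an arc from hoa (position 3) to đỏ (position 4). A real system would also have to handle one-to-many and unaligned words, which this sketch simply skips.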