Building a treebank for Vietnamese dependency parsing

Thi Luong Nguyen, L. My, Viet Hung Nguyen, Huyen Thi Minh Nguyen, Hong Phuong Le
Published in: The 2013 RIVF International Conference on Computing & Communication Technologies - Research, Innovation, and Vision for Future (RIVF), 2013-11-01
DOI: https://doi.org/10.1109/RIVF.2013.6719884
Citations: 30

Abstract

Vietnamese syntactic parsing, especially constituency parsing, has recently been tackled by several research groups. A joint effort by the Vietnamese language processing community produced VietTreebank, a reference parsed corpus of about 10,000 sentences for the constituency parsing task. In this paper, we present our work on building a reference treebank, based on VietTreebank, for the dependency parsing task, which has not yet been well studied for Vietnamese. First, we define a dependency label set by adapting the dependency scheme developed by the NLP group at Stanford University to the particularities of Vietnamese grammar. Then we propose an algorithm to convert a constituency treebank into a dependency treebank. The algorithm is tested on a set of 100 sentences from the VietTreebank corpus and gives very good results. Finally, we carry out an experiment on Vietnamese dependency parsing using the MaltParser tool and the dependency treebank converted from VietTreebank.
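
The abstract does not spell out the conversion algorithm, so the following is only a minimal sketch of one common way to turn a constituency tree into dependencies: head percolation with a table of head rules. It is not the authors' method. The `Node` class, the `HEAD_RULES` table, the phrase and tag names, and the leftmost-child fallback are all assumptions made for illustration, and the arcs are left unlabeled, whereas the paper also assigns labels from its adapted dependency label set.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """A constituency tree node (hypothetical structure, not VietTreebank's format)."""
    label: str                       # phrase label, or POS tag on a leaf
    word: Optional[str] = None       # set only on leaves
    index: int = 0                   # 1-based token position, set only on leaves
    children: List["Node"] = field(default_factory=list)


# Hypothetical head rules: for each phrase label, the child labels to
# prefer as head, in priority order. A real table would be much larger.
HEAD_RULES = {
    "S": ["VP", "NP"],
    "VP": ["V", "VP"],
    "NP": ["N", "NP"],
}


def find_head(node: Node, arcs: List[Tuple[int, int]]) -> Node:
    """Return the head leaf of `node`, recording (dependent, head) arcs on the way."""
    if node.word is not None:
        return node
    heads = [find_head(child, arcs) for child in node.children]
    chosen = 0  # assumed fallback: leftmost child when no rule matches
    for preferred in HEAD_RULES.get(node.label, []):
        match = next(
            (i for i, child in enumerate(node.children) if child.label == preferred),
            None,
        )
        if match is not None:
            chosen = match
            break
    head = heads[chosen]
    # Every non-head child attaches its own head word to the chosen head word.
    for i, other in enumerate(heads):
        if i != chosen:
            arcs.append((other.index, head.index))
    return head


def to_dependencies(root: Node) -> List[Tuple[int, int]]:
    """Convert a constituency tree to sorted (dependent, head) pairs; head 0 is the root."""
    arcs: List[Tuple[int, int]] = []
    top = find_head(root, arcs)
    arcs.append((top.index, 0))
    return sorted(arcs)


# Toy example: "Tôi đọc sách" ("I read books"), with an assumed bracketing.
tree = Node("S", children=[
    Node("NP", children=[Node("P", word="Tôi", index=1)]),
    Node("VP", children=[
        Node("V", word="đọc", index=2),
        Node("NP", children=[Node("N", word="sách", index=3)]),
    ]),
])
arcs = to_dependencies(tree)
print(arcs)  # [(1, 2), (2, 0), (3, 2)]
```

In this toy tree the verb "đọc" becomes the root and both nouns attach to it. A full converter would additionally need a rule for choosing the dependency label of each arc.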