Title: Projecting dependency syntax labels from English into Vietnamese in English-Vietnamese bilingual corpus
Authors: Phuoc Tran, V. Duong, Dinh Dien, Bay Vo, Huu Nguyen, Long H. B. Nguyen
Journal: International Journal of Intelligent Information and Database Systems, volume 9 1, pages 17-32
Publication date: 2020-06-26
Publication type: Journal Article
DOI: 10.1504/ijiids.2020.10030209 (https://doi.org/10.1504/ijiids.2020.10030209)
JCR: Q3 (Computer Science)
Open access: no
Citations: 0
Abstract
In natural language processing, corpora play an important role, particularly labelled corpora such as labelled part-of-speech corpora, labelled component syntax corpora, and labelled dependency syntax corpora. These labelled corpora are used for corpus-based research and give higher-quality results than unlabelled ones. In this paper, we build a Vietnamese dependency label tagger based on an English-Vietnamese bilingual corpus in which the English text was tagged with dependency labels. The experimental results show that our method achieves a labelled attachment score (LAS) of 73.5% and an unlabelled attachment score (UAS) of 81.7%.
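The two reported measures can be illustrated with a minimal sketch. It assumes the standard definitions: UAS is the share of tokens whose predicted head is correct, and LAS is the share whose head and dependency label are both correct. The function name `attachment_scores`, the `(head, label)` representation, and the toy sentence are hypothetical and are not taken from the paper, whose evaluation setup is not described here.

```python
from typing import List, Tuple

# One token's analysis: (index of its head, dependency label).
# Head index 0 stands for the artificial root. This encoding is an assumption.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) for token-aligned gold and predicted analyses."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty sequence")
    # UAS counts tokens whose head matches, ignoring the label.
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LAS counts tokens whose head and label both match.
    labelled_hits = sum(g == p for g, p in zip(gold, pred))
    n = len(gold)
    return head_hits / n, labelled_hits / n


# Invented four-token example, for illustration only.
gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (3, "amod")]
pred = [(2, "nsubj"), (0, "root"), (2, "nmod"), (2, "amod")]
uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.2%}, LAS = {las:.2%}")
```

Because a correct labelled arc also requires a correct head, LAS can never exceed UAS, which is consistent with the figures reported in the abstract.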
About the journal:
Intelligent information systems and intelligent database systems are a rapidly developing field of computer science. IJIIDS provides a medium for exchanging scientific research and technological achievements accomplished by the international community. It focuses on research in applications of advanced intelligent technologies for data storage and processing in a wide-ranging context. The issues addressed by IJIIDS involve solutions to real-life problems in which intelligent technologies must be applied to achieve effective results. The emphasis of the reported work is on new and original research and technological developments rather than reports on the application of existing technology to different sets of data.