Statistical Machine Translation between Myanmar and Myeik

T. Oo, Ye Kyaw Thu, K. Soe, T. Supnithi
Proceedings of 2020 the 10th International Workshop on Computer Science and Engineering
DOI: 10.18178/wcse.2020.02.007 (https://doi.org/10.18178/wcse.2020.02.007)
Citations: 3

Abstract

This paper contributes the first evaluation of machine translation quality between Myanmar and Myeik (also known as Beik). We also developed a Myanmar-Myeik parallel corpus (around 10K sentences) based on the Myanmar-language portion of the ASEAN MT corpus. In addition, two types of segmentation were studied: word segmentation and syllable segmentation. Ten-fold cross-validation experiments were carried out using three statistical machine translation approaches: phrase-based, hierarchical phrase-based, and the operation sequence model (OSM). The results show that all three approaches give high and comparable BLEU and RIBES scores for both Myanmar-to-Myeik and Myeik-to-Myanmar translation. The OSM approach achieved the highest BLEU and RIBES scores among the three approaches. We also found that syllable segmentation is better suited to translation quality than word-level segmentation.
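The abstract mentions ten-fold cross-validation over a parallel corpus. As a rough illustration only, the following Python sketch shows one common way to partition sentence pairs into ten train/test splits. The function name `k_fold_splits`, the shuffling seed, and the toy corpus are assumptions made for this example; the abstract does not describe how the authors actually built their folds.

```python
import random


def k_fold_splits(pairs, k=10, seed=0):
    """Yield (train, test) lists for k-fold cross-validation.

    `pairs` is a list of (source, target) sentence pairs. The shuffle
    and the round-robin fold assignment are illustrative choices, not
    the procedure used in the paper.
    """
    indices = list(range(len(pairs)))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, starting at position i.
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        test_idx = set(folds[held_out])
        train = [pairs[i] for i in indices if i not in test_idx]
        test = [pairs[i] for i in folds[held_out]]
        yield train, test


# Hypothetical toy corpus standing in for the ~10K Myanmar-Myeik pairs.
corpus = [(f"source sentence {i}", f"target sentence {i}") for i in range(25)]
splits = list(k_fold_splits(corpus, k=10))
```

Each sentence pair lands in exactly one test fold, so every pair is used for evaluation once and for training in the other nine runs; scores such as BLEU and RIBES would then typically be averaged over the ten runs.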