Javanese speech levels machine translation: Improved parallel text alignment based on impossible pair limitation

A. Wibawa, A. Nafalski, Wayah Firdaus Mahmudy
Published in: 2013 IEEE International Conference on Computational Intelligence and Cybernetics (CYBERNETICSCOM), 2013-12-01
DOI: 10.1109/CYBERNETICSCOM.2013.6865773 (https://doi.org/10.1109/CYBERNETICSCOM.2013.6865773)
Citation count: 3

Abstract

A machine translation system is developed to preserve Javanese speech levels. The system relies on phrase-based bi-text alignment to build the language corpora, and the edit shifting distance is applied to increase alignment efficiency. However, improper alignments caused by recorded impossible pairs and insufficient training data are still detected. This paper proposes an improvement to the alignment algorithm based on an impossible pair restriction. It compares three configurations: the fundamental approach (AL1), the basic algorithm with extended training data (AL2), and the improved algorithm with standard training data (AL3). In the experiments, AL3 (A = 90.5%) is markedly more accurate than AL1 (A = 79.6%) and AL2 (A = 85.9%).
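The abstract does not spell out the alignment procedure, so the following is only a minimal sketch of the general idea of restricting impossible pairs during alignment. Everything in it is an assumption made for illustration: plain Levenshtein distance stands in for the paper's edit shifting distance (which is not defined here), the function names `levenshtein` and `best_alignment` are hypothetical, and the tokens are placeholders rather than real Javanese data.

```python
from typing import Iterable, Optional, Set, Tuple


def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance.

    Assumption: used only as a stand-in for the paper's
    "edit shifting distance", which the abstract does not define.
    """
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def best_alignment(
    source: str,
    candidates: Iterable[str],
    impossible_pairs: Set[Tuple[str, str]],
) -> Optional[str]:
    """Pick the closest candidate, skipping pairs marked as impossible.

    Hypothetical helper: `impossible_pairs` is an assumed blocklist of
    (source, target) pairs that must never be aligned.
    """
    allowed = [c for c in candidates if (source, c) not in impossible_pairs]
    if not allowed:
        return None  # every candidate was ruled out
    return min(allowed, key=lambda c: levenshtein(source, c))


if __name__ == "__main__":
    # Placeholder tokens, not real corpus entries.
    blocked = {("abcd", "abce")}
    print(best_alignment("abcd", ["abce", "xyz"], set()))
    print(best_alignment("abcd", ["abce", "xyz"], blocked))
```

In this sketch the restriction is applied before scoring, so a blocked pair can never win even when it has the smallest distance. Whether the paper filters before or after scoring is not stated in the abstract.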