A Hybrid Rules and Statistical Method for Arabic to English Machine Translation

Arwa Alqudsi, N. Omar, Khalid Shaker
{"title":"A Hybrid Rules and Statistical Method for Arabic to English Machine Translation","authors":"Arwa Alqudsi, N. Omar, Khalid Shaker","doi":"10.1109/CAIS.2019.8769545","DOIUrl":null,"url":null,"abstract":"Arabic is one of the six major world languages. It originated in the area currently known as the Arabian Peninsula. Arabic is the joint official language in Middle Eastern and African states. Large communities of Arabic speakers have existed outside of the Middle East since the end of the last century, particularly in the United States and Europe. So finding a quick and efficient Arabic machine translator has become an urgent necessity, due to the differences between the languages spoken in the world's communities and the vast development that has occurred worldwide. Arabic combines many of the significant challenges of other languages like word order and ambiguity. The word ordering problem because of Arabic has four sentence structures which allow different word orders. Ambiguity in the Arabic language is a notorious problem because of the richness and complexity of Arabic morphology. The core problems in machine translation are reordering the words and estimating the right word translation among many options in the lexicon. The Rule-Based Machine translation (RBMT) approach is the way to reorder words, and the statistical approach, such as Expectation Maximisation (EM), is the way to select right word translations and count word frequencies. Combining RBMT with EM plays an impotent role in generating a good-quality MT. This paper presents a combination of the rule-based machine translation (RBMT) approach with the Expectation Maximisation (EM) algorithm. 
These two techniques have been applied successfully to word ordering and ambiguity problems in Arabic-to-English machine translation.","PeriodicalId":220129,"journal":{"name":"2019 2nd International Conference on Computer Applications & Information Security (ICCAIS)","volume":"98 12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 2nd International Conference on Computer Applications & Information Security (ICCAIS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CAIS.2019.8769545","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 5

Abstract

Arabic is one of the six major world languages. It originated in the area currently known as the Arabian Peninsula. Arabic is a joint official language in Middle Eastern and African states. Large communities of Arabic speakers have existed outside of the Middle East since the end of the last century, particularly in the United States and Europe. Finding a fast and efficient Arabic machine translator has therefore become an urgent necessity, given the differences between the languages spoken in the world's communities and the vast development that has occurred worldwide. Arabic combines many of the significant challenges found in other languages, such as word order and ambiguity. The word ordering problem arises because Arabic has four sentence structures, which allow different word orders. Ambiguity in the Arabic language is a notorious problem because of the richness and complexity of Arabic morphology. The core problems in machine translation are reordering the words and estimating the right word translation among many options in the lexicon. The rule-based machine translation (RBMT) approach is a way to reorder words, and a statistical approach, such as Expectation Maximisation (EM), is a way to select the right word translations and count word frequencies. Combining RBMT with EM plays an important role in generating good-quality machine translation. This paper presents a combination of the RBMT approach with the EM algorithm. These two techniques have been applied successfully to the word ordering and ambiguity problems in Arabic-to-English machine translation.
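The division of labour described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the reordering rule (verb-subject to subject-verb), the tag names, the transliterated toy corpus, and the function names `reorder_vso_to_svo`, `train_em`, and `best_translation` are all assumptions made for illustration. The EM part follows the familiar IBM Model 1 style of estimating word translation probabilities from sentence pairs, which may differ from the exact formulation used in the paper.

```python
from collections import defaultdict


def reorder_vso_to_svo(tagged):
    """Hypothetical reordering rule: if a clause starts with a verb
    followed by a noun (verb-subject order), swap the two so the
    clause follows English subject-verb order."""
    if len(tagged) >= 2 and tagged[0][1] == "VERB" and tagged[1][1] == "NOUN":
        return [tagged[1], tagged[0]] + list(tagged[2:])
    return list(tagged)


def train_em(pairs, iterations=10):
    """IBM Model 1-style EM: estimate t(e | f), the probability that
    source word f translates to target word e, from sentence pairs."""
    # Start uniform over every word pair that co-occurs in some sentence pair.
    t = {}
    for src, tgt in pairs:
        for e in tgt:
            for f in src:
                t[(e, f)] = 1.0

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (e, f) alignments
        total = defaultdict(float)  # expected count of f being aligned at all
        # E-step: distribute each target word over the source words.
        for src, tgt in pairs:
            for e in tgt:
                norm = sum(t[(e, f)] for f in src)
                for f in src:
                    frac = t[(e, f)] / norm
                    count[(e, f)] += frac
                    total[f] += frac
        # M-step: renormalise expected counts into probabilities.
        t = {(e, f): c / total[f] for (e, f), c in count.items()}
    return t


def best_translation(t, f):
    """Pick the target word with the highest estimated t(e | f)."""
    candidates = {e: p for (e, src_word), p in t.items() if src_word == f}
    return max(candidates, key=candidates.get)


# Toy transliterated corpus (assumed data, not from the paper).
corpus = [
    (["bayt", "kabir"], ["big", "house"]),
    (["bayt"], ["house"]),
    (["kitab", "kabir"], ["big", "book"]),
    (["kitab"], ["book"]),
]
table = train_em(corpus)
clause = [("kataba", "VERB"), ("al-walad", "NOUN"), ("al-dars", "NOUN")]
reordered = reorder_vso_to_svo(clause)
```

In this sketch the rule handles word order deterministically, while EM resolves lexical ambiguity by letting co-occurrence evidence sharpen the translation table over successive iterations. A realistic system would need many more rules and a far larger parallel corpus.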