The Negative Transfer Effect on the Neural Machine Translation of Egyptian Arabic Adjuncts into English: The Case of Google Translate

Q1 Arts and Humanities
Rania Al-Sabbagh
DOI: 10.33806/ijaes.v24i1.560 (https://doi.org/10.33806/ijaes.v24i1.560)
Published 2023-10-05 in the International Journal of Arabic-English Studies (journal article)

Abstract

Parallel corpora pairing low-resource Arabic dialects with English are scarce and small, so most neural machine translation models, including Google Translate, are trained mainly on parallel corpora of standard Arabic and English and then applied to dialectal Arabic. The underlying assumption is that a model well trained to translate to and from standard Arabic will also translate dialectal Arabic efficiently, given the similarities between the two varieties. This study demonstrates the cost of not using large-scale, dialect-specific parallel corpora by quantitatively and qualitatively analyzing Google Translate's performance on Egyptian Arabic adjuncts. Measured against human reference translations, Google Translate achieved a low BLEU score of 14.69. Qualitative analysis showed that reliance on standard Arabic parallel corpora caused negative transfer, manifested in four ways: the literal translation of idiomatic adjuncts, the misinterpretation of dialectal adjuncts as main clause constituents, the translation of dialectal adjuncts as if they were orthographically similar standard Arabic words, and the use of common standard Arabic lexical meanings to translate dialect-specific adjuncts. The findings are relevant to researchers working on dialectal Arabic neural machine translation and have implications for investment in large-scale, dialect-specific corpora that better capture the peculiarities of Arabic dialects and reduce negative transfer from standard Arabic.
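For readers unfamiliar with the metric, the sketch below shows how a corpus-level BLEU score on a 0-100 scale can be computed from machine output and human references. It is a minimal illustration only: the abstract does not state which BLEU implementation, tokenization, or smoothing the study used, so the function `corpus_bleu`, the whitespace tokenization, and the single-reference setup are assumptions, and the English sentences are invented examples rather than data from the study.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale.

    Assumptions (not taken from the paper): one reference per hypothesis,
    lowercased whitespace tokenization, no smoothing.
    """
    matches = [0] * max_n   # clipped n-gram matches, per order
    totals = [0] * max_n    # hypothesis n-gram counts, per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.lower().split(), ref.lower().split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, r_counts[g]) for g, c in h_counts.items())
            totals[n - 1] += sum(h_counts.values())
    # Without smoothing, any order with zero matches gives a score of zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize output that is shorter than the references.
    brevity = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity * math.exp(log_precision)


# Invented example: an idiomatic reference versus a slightly off rendering.
reference = ["she finished the work in the blink of an eye"]
hypothesis = ["she finished the work in a blink of the eye"]
print(round(corpus_bleu(hypothesis, reference), 2))
```

Because the score is a geometric mean of 1-gram to 4-gram precisions, even a few word-level deviations in short adjunct phrases break many of the longer n-grams, which is one reason literal renderings of idiomatic adjuncts can pull BLEU down sharply.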
Source journal
International Journal of Arabic-English Studies (Arts and Humanities: Literature and Literary Theory)
CiteScore: 1.10
Self-citation rate: 0.00%
Articles published: 33
Journal description: The aim of this international refereed journal is to promote original research into cross-language and cross-cultural studies in general, and Arabic-English contrastive and comparative studies in particular. Within this framework, the journal welcomes contributions to such areas of interest as comparative literature, contrastive textology, contrastive linguistics, lexicology, stylistics, and translation studies. The journal is also interested in theoretical and practical research on both English and Arabic as well as in foreign language education in the Arab world. Reviews of important, up-to-date, relevant publications in English and Arabic are also welcome. In addition to articles and book reviews, IJAES has room for notes, discussion, and relevant academic presentations and reports. These may consist of comments, statements on current issues, short reports on ongoing research, or short replies to other articles. The International Journal of Arabic-English Studies (IJAES) is the forum of debate and research for the Association of Professors of English and Translation at Arab Universities (APETAU). However, contributions from scholars involved in language, literature, and translation across language communities are invited.