Multilingual neural machine translation by cascading computational graphs

IF 7.5 · CAS Tier 1 (Computer Science) · JCR Q1 (Computer Science, Artificial Intelligence)
Abouzar Qorbani, Reza Ramezani, Ahmad Baraani, Arefeh Kazemi
DOI: 10.1016/j.eswa.2025.128722
Journal: Expert Systems with Applications, Volume 294, Article 128722
Published: 2025-06-25 (Journal Article)
Citations: 0

Abstract

In the era of artificial intelligence, multilingual models have become increasingly vital in machine translation tasks. However, Multilingual Neural Machine Translation (MNMT) faces persistent challenges, notably reduced translation quality and language interference. When training on diverse language pairs, translation performance for certain languages may degrade due to negative transfer effects. To address this problem, researchers have proposed various strategies such as parameter sharing, partial sharing, and language-specific parameterization. Despite these efforts, limitations remain, including high data requirements, reliance on linguistic relatedness, inflexibility in adapting the model architecture during training, and negative inference (producing output in an unintended language). Identifying and selectively modifying effective and ineffective nodes within a neural model can substantially enhance translation performance, particularly for low-resource and extremely low-resource languages. This paper proposes a novel method that identifies ineffective nodes in an MNMT model and corrects them by twinning them with effective counterparts, achieved through computational graph grouping based on semantic similarity.

The proposed method has been evaluated on several multilingual datasets, including TED2013, TED2020, and BIBLE. Relative to baseline models, it demonstrates notable improvements in BLEU scores, achieving relative gains of 23.7% on TED2013, 7.06% on TED2020, and 16.9% on BIBLE. It also outperforms large-scale systems such as ChatGPT, Bing GPT-4, and Google Neural Machine Translation (GNMT) across all evaluated datasets. Furthermore, performance has been assessed on the extremely low-resource language pair English–Igbo using the OPUS-100 dataset. The results show that the proposed method outperforms baseline models by 2.58%, while the large-scale Madlad400-3B model, despite its depth (32 layers, 450 languages), struggles in this setting. Similarly, the Semlin-MNMT model performs well on high-resource pairs but degrades significantly on low-resource languages. Overall, the proposed method provides a robust and scalable approach for enhancing MNMT quality in both one-to-many and many-to-many translation scenarios. Its effectiveness in low-resource and extremely low-resource settings highlights its practical value and contribution to the advancement of multilingual translation systems.
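The reported improvements are relative gains over each baseline's BLEU score, not absolute BLEU-point differences, which is worth keeping in mind when comparing the figures. A quick illustration with hypothetical scores (the actual baseline BLEU values are not given in the abstract):

```python
def relative_gain(baseline_bleu: float, new_bleu: float) -> float:
    """Relative BLEU improvement over a baseline, in percent."""
    return 100.0 * (new_bleu - baseline_bleu) / baseline_bleu

# Hypothetical example: a baseline of 20.0 BLEU improved to 24.74 BLEU
# corresponds to a 23.7% relative gain, the way the TED2013 figure is
# expressed; the same absolute improvement from a 40.0 baseline would
# only be an 11.85% relative gain.
print(round(relative_gain(20.0, 24.74), 1))  # prints 23.7
print(round(relative_gain(40.0, 44.74), 2))  # prints 11.85
```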
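The abstract only outlines the node-twinning idea, so the following is a speculative sketch rather than the authors' implementation. It assumes that "ineffectiveness" can be proxied by low activation magnitude and "semantic similarity" by cosine similarity between unit activation patterns; the function name, thresholds, and the weight-copying step are all hypothetical.

```python
import numpy as np

def twin_ineffective_nodes(activations, weights, sim_threshold=0.9, norm_quantile=0.2):
    """Flag low-magnitude ("ineffective") units and overwrite each one's
    weights with those of its most activation-similar effective unit (its
    "twin").  activations: (n_samples, n_units); weights: (n_units, dim)."""
    # Per-unit activation magnitude; the lowest quantile is deemed ineffective.
    norms = np.linalg.norm(activations, axis=0)
    cutoff = np.quantile(norms, norm_quantile)
    ineffective = norms < cutoff

    # Cosine similarity between unit activation patterns (a crude stand-in
    # for the paper's semantic-similarity-based graph grouping).
    unit_vecs = activations.T / (norms[:, None] + 1e-8)
    sim = unit_vecs @ unit_vecs.T

    twinned = weights.copy()
    effective = np.where(~ineffective)[0]
    for i in np.where(ineffective)[0]:
        j = effective[np.argmax(sim[i, effective])]
        if sim[i, j] >= sim_threshold:
            twinned[i] = weights[j]   # twin with the effective counterpart
    return twinned, ineffective
```

Copying the twin's weights outright is only one plausible reading of "twinning"; the actual method may instead interpolate parameters or operate on whole groups of the computational graph rather than single units.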
Source journal: Expert Systems with Applications
Category: Engineering Technology – Engineering: Electrical & Electronic
CiteScore: 13.80
Self-citation rate: 10.60%
Articles per year: 2045
Review time: 8.7 months
Journal description: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.