Translation Performance from the User’s Perspective of Large Language Models and Neural Machine Translation Systems

IF 2.4 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS
Jungha Son, Boyoung Kim
{"title":"Translation Performance from the User’s Perspective of Large Language Models and Neural Machine Translation Systems","authors":"Jungha Son, Boyoung Kim","doi":"10.3390/info14100574","DOIUrl":null,"url":null,"abstract":"The rapid global expansion of ChatGPT, which plays a crucial role in interactive knowledge sharing and translation, underscores the importance of comparative performance assessments in artificial intelligence (AI) technology. This study concentrated on this crucial issue by exploring and contrasting the translation performances of large language models (LLMs) and neural machine translation (NMT) systems. For this aim, the APIs of Google Translate, Microsoft Translator, and OpenAI’s ChatGPT were utilized, leveraging parallel corpora from the Workshop on Machine Translation (WMT) 2018 and 2020 benchmarks. By applying recognized evaluation metrics such as BLEU, chrF, and TER, a comprehensive performance analysis across a variety of language pairs, translation directions, and reference token sizes was conducted. The findings reveal that while Google Translate and Microsoft Translator generally surpass ChatGPT in terms of their BLEU, chrF, and TER scores, ChatGPT exhibits superior performance in specific language pairs. Translations from non-English to English consistently yielded better results across all three systems compared with translations from English to non-English. Significantly, an improvement in translation system performance was observed as the token size increased, hinting at the potential benefits of training models on larger token sizes.","PeriodicalId":38479,"journal":{"name":"Information (Switzerland)","volume":null,"pages":null},"PeriodicalIF":2.4000,"publicationDate":"2023-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information (Switzerland)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/info14100574","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0

Abstract

The rapid global expansion of ChatGPT, which plays a crucial role in interactive knowledge sharing and translation, underscores the importance of comparative performance assessments in artificial intelligence (AI) technology. This study concentrated on this crucial issue by exploring and contrasting the translation performances of large language models (LLMs) and neural machine translation (NMT) systems. For this aim, the APIs of Google Translate, Microsoft Translator, and OpenAI’s ChatGPT were utilized, leveraging parallel corpora from the Workshop on Machine Translation (WMT) 2018 and 2020 benchmarks. By applying recognized evaluation metrics such as BLEU, chrF, and TER, a comprehensive performance analysis across a variety of language pairs, translation directions, and reference token sizes was conducted. The findings reveal that while Google Translate and Microsoft Translator generally surpass ChatGPT in terms of their BLEU, chrF, and TER scores, ChatGPT exhibits superior performance in specific language pairs. Translations from non-English to English consistently yielded better results across all three systems compared with translations from English to non-English. Significantly, an improvement in translation system performance was observed as the token size increased, hinting at the potential benefits of training models on larger token sizes.
基于用户视角的大语言模型和神经机器翻译系统的翻译性能
ChatGPT在交互式知识共享和翻译中发挥着至关重要的作用,它在全球的迅速扩张凸显了人工智能(AI)技术中比较绩效评估的重要性。本研究通过探索和对比大型语言模型(llm)和神经机器翻译(NMT)系统的翻译性能来关注这一关键问题。为此,我们利用了谷歌翻译、微软翻译和OpenAI的ChatGPT的api,并利用了2018年和2020年机器翻译研讨会(WMT)基准的平行语料库。通过应用BLEU、chrF和TER等公认的评估指标,对各种语言对、翻译方向和参考标记大小进行了全面的性能分析。研究结果显示,虽然谷歌翻译和微软翻译在BLEU、chrF和TER得分方面普遍超过ChatGPT,但ChatGPT在特定语言对上表现优异。与从英语到非英语的翻译相比,从非英语到英语的翻译在所有三个系统中始终产生更好的结果。值得注意的是,随着令牌大小的增加,可以观察到翻译系统性能的改善,这暗示了在更大的令牌大小上训练模型的潜在好处。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Information (Switzerland)
Information (Switzerland) Computer Science-Information Systems
CiteScore
6.90
自引率
0.00%
发文量
515
审稿时长
11 weeks
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信