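Exploring ChatGPT's potential for augmenting post-editing in machine translation across multiple domains: challenges and opportunities.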

Impact factor: 3.0 · JCR Q2 · Computer Science, Artificial Intelligence
Frontiers in Artificial Intelligence · Pub Date: 2025-05-01 · eCollection Date: 2025-01-01 · DOI: 10.3389/frai.2025.1526293
Jeehaan Algaraady, Mohammad Mahyoob
Volume 8, article no. 1526293 · Publication type: Journal Article · Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12078335/pdf/ · https://doi.org/10.3389/frai.2025.1526293
Citations: 0

Abstract


Introduction: Post-editing plays a crucial role in enhancing the quality of machine-generated translation (MGT) by correcting errors and ensuring cohesion and coherence. With advancements in artificial intelligence, Large Language Models (LLMs) like ChatGPT-4o offer promising capabilities for post-editing tasks. This study investigates the effectiveness of ChatGPT-4o as a natural language processing tool in post-editing Arabic translations across various domains, aiming to evaluate its performance in improving productivity, accuracy, consistency, and overall translation quality.

Methods: The study involved a comparative analysis of Arabic translations generated by Google Translate. These texts, drawn from multiple domains, were post-edited by two professional human translators and by ChatGPT-4o. Subsequently, three additional professional human post-editors evaluated both sets of post-edited outputs. To statistically assess the differences in quality between the human and ChatGPT-4o post-edits, a paired t-test was employed, focusing on metrics such as fluency, accuracy, coherence, and efficiency.
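
The paired t-test named in the Methods can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name `paired_t` and the scores are hypothetical, standing in for per-evaluator ratings of the human and ChatGPT-4o post-edits on a single metric.

```python
import math
import statistics


def paired_t(first, second):
    """Return (t, df) for a paired t-test on two equal-length score lists."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    # The test works on the per-pair differences, not on the raw scores.
    diffs = [a - b for a, b in zip(first, second)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical ratings from three evaluators (not data from the study).
human_scores = [4.0, 5.0, 6.0]
model_scores = [3.0, 3.0, 3.0]
t_stat, df = paired_t(human_scores, model_scores)
print(f"t = {t_stat:.2f}, df = {df}")
```

A positive t here means the first sample scored higher on average; swapping the argument order flips the sign, which is why the sign of a reported t-statistic only makes sense once the order of comparison is known.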

Results: The findings indicated that human post-editors outperformed ChatGPT-4o in most quality metrics. However, ChatGPT-4o demonstrated superior efficiency, yielding a positive t-statistic of 8.00 and a p-value of 0.015, indicating a statistically significant difference. Regarding fluency, no significant difference was observed between the two methods (t-statistic = -3.5, p-value = 0.074), suggesting comparable performance in ensuring the natural flow of text.
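
As a rough sanity check on the reported figures, the two-tailed p-value of a t-statistic has a simple closed form when the test has 2 degrees of freedom: p = 1 - |t| / sqrt(t^2 + 2). The abstract does not state the degrees of freedom; df = 2 is an assumption made here (it would correspond to three paired observations, for example one per evaluator), and the function name is hypothetical.

```python
import math


def two_tailed_p_df2(t):
    """Two-tailed p-value for Student's t with exactly 2 degrees of freedom.

    Uses the closed-form CDF F(t) = 1/2 + t / (2 * sqrt(t**2 + 2)),
    which holds only for df = 2 (an assumption, not stated in the abstract).
    """
    return 1.0 - abs(t) / math.sqrt(t * t + 2.0)


# t-statistics as reported in the Results paragraph.
for label, t in [("efficiency", 8.00), ("fluency", -3.5)]:
    print(f"{label}: t = {t}, p = {two_tailed_p_df2(t):.3f}")
```

Under that assumption the formula gives about 0.015 for t = 8.00, matching the reported value, and about 0.073 for t = -3.5, close to the reported 0.074; the small gap could be explained by rounding of the t-statistic.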

Discussion: ChatGPT-4o showed competitive performance in English-to-Arabic post-editing, particularly in producing fluent, coherent, and stylistically consistent text. Its conversational design enables efficient and consistent editing across various domains. Nonetheless, the model faced challenges in handling grammatical and syntactic nuances, domain-specific idioms, and complex terminology, especially in medical and sports contexts. Overall, the study highlights the potential of ChatGPT-4o as a supportive tool in translation post-editing workflows, complementing human translators by enhancing productivity and maintaining acceptable quality standards.

Source journal: CiteScore 6.10 · Self-citation rate 2.50% · Articles published 272 · Review time 13 weeks