A Quantitative Analysis of Discourse Phenomena in Machine Translation

Impact factor: 0.5; JCR quartile: Q3 (Linguistics)
Carolina Scarton, Lucia Specia
Journal: Discours-Revue de Linguistique Psycholinguistique et Informatique
Publication type: Journal Article
Published: 2015-09-09
DOI: 10.4000/DISCOURS.9047 (https://doi.org/10.4000/DISCOURS.9047)
Citations: 8

Abstract

State-of-the-art Machine Translation (MT) systems translate documents by considering isolated sentences, disregarding information beyond sentence level. As a result, machine-translated documents often contain problems related to discourse coherence and cohesion. Recently, some initiatives in the evaluation and quality estimation of MT outputs have attempted to detect discourse problems in order to assess the quality of these machine translations. However, a quantitative analysis of discourse phenomena in MT outputs is still needed in order to better understand the phenomena and identify possible solutions or ways to improve evaluation. This paper aims to answer the following questions: What is the impact of discourse phenomena on MT quality? Can we capture and measure quantitatively any issues related to discourse in MT outputs? In order to answer these questions, we present a quantitative analysis of several discourse phenomena and correlate the resulting figures with scores from automatic translation quality evaluation metrics. We show that figures related to discourse phenomena present a higher correlation with quality scores than the baseline counts widely used for quality estimation of MT.
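The correlation step described in the abstract can be illustrated with a minimal sketch. It assumes one discourse-related count and one automatic metric score per document, and it assumes Pearson's coefficient; the abstract does not name the coefficient, the feature, or the metric, so `pearson`, `connective_counts`, and `metric_scores` are hypothetical names, and the numbers are invented toy values, not data from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Toy values for illustration only (assumption: one value per document).
connective_counts = [3, 5, 2, 8, 6]              # hypothetical discourse feature
metric_scores = [0.21, 0.30, 0.18, 0.41, 0.33]   # hypothetical metric scores

r = pearson(connective_counts, metric_scores)
print(f"Pearson r = {r:.3f}")
```

Under these assumptions, the same function could be applied to a baseline count (for example, a token count per document) so that the two coefficients can be compared.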