A method for automatic detection and manual localization of content-based translation errors and shifts

Journal of Innovation in Digital Ecosystems. Pub Date: 2014-12-01. DOI: 10.1016/j.jides.2015.02.004

Éric André Poirier

Volume 1, Pages 38-46. Journal Article. https://www.sciencedirect.com/science/article/pii/S235266451500005X

Citations: 5

Abstract

In this paper, we describe a method for the automatic detection and manual localization of four compositional translation errors and shifts between automatically aligned segments in source and target languages. The automatic detection of errors and shifts is based on a content word precision algorithm, which measures the equality of information content between source and target segments. The manual localization of errors and shifts within the segments is based on the compositionality principle. The method allows for the detection and localization of two potential errors, omission and addition, as well as two translation shifts required to avoid a translation error: over-translation (when a correct translation results in more content words than in the source segment) and under-translation (when a correct translation results in fewer content words than in the source segment). Because localization within bilingual pairs of segments is manual, the method is not intended for automatic error detection but for human-assisted revision of translations. The analysis described with the method and the algorithm is applied to real translation examples culled from a state-of-the-art translation corpus sampled for various translation errors. The algorithm and the localization method have implications for the development of more content-oriented natural language processing as well as for the training of professional translators; they can also be useful for the formal and systematic description of content-based translation errors.
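The abstract does not spell out the content word precision algorithm, so the following is only a minimal sketch of the detection step under stated assumptions: content words are approximated as tokens that are not in a small, hypothetical function-word list, and a segment pair is flagged whenever the source and target content word counts differ. The names `FUNCTION_WORDS`, `content_words`, and `flag_segment_pair` are illustrative and do not come from the paper. Deciding whether a flagged pair is an error (omission, addition) or a legitimate shift (under-translation, over-translation) is left to the human reviser, in line with the manual localization described above.

```python
import re

# Hypothetical, deliberately tiny function-word lists (an assumption for
# illustration; the paper's actual lexical resources are not given here).
FUNCTION_WORDS = {
    "en": {"the", "a", "an", "of", "to", "in", "is", "and", "on", "at", "it"},
    "fr": {"le", "la", "les", "un", "une", "de", "du", "des", "à", "en",
           "est", "et", "sur", "il"},
}


def content_words(segment, lang):
    """Return the tokens of `segment` that are not listed as function words."""
    tokens = re.findall(r"\w+", segment.lower())
    return [t for t in tokens if t not in FUNCTION_WORDS[lang]]


def flag_segment_pair(source, target, src_lang="en", tgt_lang="fr"):
    """Compare content word counts of an aligned source/target segment pair.

    Returns (source_count, target_count, label). The label only says which
    side carries more content words; a human decides error versus shift.
    """
    n_src = len(content_words(source, src_lang))
    n_tgt = len(content_words(target, tgt_lang))
    if n_tgt > n_src:
        label = "addition or over-translation"
    elif n_tgt < n_src:
        label = "omission or under-translation"
    else:
        label = "balanced"
    return n_src, n_tgt, label


if __name__ == "__main__":
    # Target drops "sofa": fewer content words than the source.
    print(flag_segment_pair("The cat sleeps on the sofa", "Le chat dort"))
    # Same number of content words on both sides.
    print(flag_segment_pair("The cat sleeps", "Le chat dort"))
```

A count comparison of this kind can only signal that a pair deserves attention; it cannot tell which words are responsible, which is why the method pairs it with manual localization inside the segment.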

