Source language classification of indirect translations

IF 0.8, Zone 2 (Literature), JCR 0, LANGUAGE & LINGUISTICS

Target-International Journal of Translation Studies Pub Date : 2022-04-11 DOI:10.1075/target.00006.iva

I. Ivaska, Laura Ivaska


Citations: 3

Abstract

One of the major barriers to the systematic study of indirect translation – that is, translations of translations – is the lack of efficient methods to identify these translations. In this article, we use supervised machine learning to examine whether computers can be harnessed to identify indirect translations. Our data consist of a monolingual comparable corpus that includes (1) nontranslated Finnish texts, (2) direct translations from English, French, German, Greek, and Swedish into Finnish, and (3) indirect translations from Greek (the ultimate source language) via English, French, German, and Swedish (mediating languages) into Finnish. We use n-grams of various types and lengths as feature sets and random forests as the statistical classification technique. To maximize the transferability of the method, the feature sets were implemented in accordance with the Universal Dependencies framework. This study confirms that computers can distinguish between translated and nontranslated Finnish, as well as between Finnish translations made from different source languages. Regarding indirect translations, the ultimate source language has a greater impact on the linguistic composition of indirect Finnish translations than their respective mediating languages. Hence, the indirect translations could not be reliably identified. Therefore, our results suggest that the reliable computational identification of indirect translations and their mediating languages requires a way to control for the effect of the ultimate source language.
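To make the feature description above concrete, the following is a minimal sketch of how n-grams of Universal Dependencies part-of-speech tags might be counted from a CoNLL-U file. The abstract does not specify the exact n-gram types or lengths, so the choice of UPOS tags, the lengths 1 to 3, the sample sentence, and the function names `read_upos_sentences` and `pos_ngram_features` are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter

# Hypothetical CoNLL-U fragment (Universal Dependencies format): one token per
# line, tab-separated, with the universal POS tag (UPOS) in the fourth column.
SAMPLE_CONLLU = """\
# text = Koira juoksi nopeasti.
1\tKoira\tkoira\tNOUN\t_\t_\t2\tnsubj\t_\t_
2\tjuoksi\tjuosta\tVERB\t_\t_\t0\troot\t_\t_
3\tnopeasti\tnopeasti\tADV\t_\t_\t2\tadvmod\t_\t_
4\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_
"""


def read_upos_sentences(conllu_text):
    """Return one list of UPOS tags per sentence in a CoNLL-U string."""
    sentences, current = [], []
    for line in conllu_text.splitlines():
        line = line.strip()
        if not line:
            # A blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        columns = line.split("\t")
        # Skip multiword-token ranges ("1-2") and empty nodes ("1.1").
        if not columns[0].isdigit():
            continue
        current.append(columns[3])
    if current:
        sentences.append(current)
    return sentences


def pos_ngram_features(conllu_text, lengths=(1, 2, 3)):
    """Count UPOS n-grams of the given lengths (assumed: 1 to 3)."""
    counts = Counter()
    for tags in read_upos_sentences(conllu_text):
        for n in lengths:
            for i in range(len(tags) - n + 1):
                counts[" ".join(tags[i:i + n])] += 1
    return counts


if __name__ == "__main__":
    for ngram, count in sorted(pos_ngram_features(SAMPLE_CONLLU).items()):
        print(f"{ngram}\t{count}")
```

Because the features rely only on UD annotation rather than on Finnish-specific tags, the same counting step would apply to any language with a UD parser, which is in line with the transferability goal stated in the abstract. Count vectors of this kind, one per text, could then be passed to a random forest classifier such as scikit-learn's `RandomForestClassifier`.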


Source journal

Target-International Journal of Translation Studies Multiple-

CiteScore

3.10

Self-citation rate

0.00%

About the journal: Target promotes the scholarly study of translational phenomena from any part of the world and welcomes submissions of an interdisciplinary nature. The journal's focus is on research on the theory, history, culture and sociology of translation and on the description and pedagogy that underpin and interact with these foci. We welcome contributions that report on empirical studies as well as speculative and applied studies. We do not publish papers on purely practical matters, and prospective contributors are advised not to submit master's theses in their raw state.