Source language classification of indirect translations

I. Ivaska, Laura Ivaska
DOI: 10.1075/target.00006.iva (https://doi.org/10.1075/target.00006.iva)
Published: 2022-04-11
Citations: 3

Abstract

One of the major barriers to the systematic study of indirect translation – that is, translations of translations – is the lack of efficient methods to identify these translations. In this article, we use supervised machine learning to examine whether computers can be harnessed to identify indirect translations. Our data consist of a monolingual comparable corpus that includes (1) nontranslated Finnish texts, (2) direct translations from English, French, German, Greek, and Swedish into Finnish, and (3) indirect translations from Greek (the ultimate source language) via English, French, German, and Swedish (mediating languages) into Finnish. We use n-grams of various types and lengths as feature sets and random forests as the statistical classification technique. To maximize the transferability of the method, the feature sets were implemented in accordance with the Universal Dependencies framework. This study confirms that computers can distinguish between translated and nontranslated Finnish, as well as between Finnish translations made from different source languages. Regarding indirect translations, the ultimate source language has a greater impact on the linguistic composition of indirect Finnish translations than their respective mediating languages. Hence, the indirect translations could not be reliably identified. Therefore, our results suggest that the reliable computational identification of indirect translations and their mediating languages requires a way to control for the effect of the ultimate source language.
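
The feature-extraction step mentioned in the abstract can be illustrated with a minimal sketch. The abstract says only that n-grams "of various types and lengths" were used in accordance with Universal Dependencies; the choice of UPOS tags as the n-gram unit, the relative-frequency normalization, and every function name below are assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter
from typing import Dict, Iterable, List, Sequence, Tuple


def ngram_counts(tags: Sequence[str], n_values: Iterable[int] = (1, 2, 3)) -> Counter:
    """Count n-grams of the requested lengths over one sequence of tags."""
    counts: Counter = Counter()
    for n in n_values:
        # range() is empty when the sequence is shorter than n
        for i in range(len(tags) - n + 1):
            counts[" ".join(tags[i:i + n])] += 1
    return counts


def relative_frequencies(counts: Counter) -> Dict[str, float]:
    """Normalize raw counts so that texts of different lengths are comparable."""
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


def to_matrix(docs: List[Dict[str, float]]) -> Tuple[List[str], List[List[float]]]:
    """Build a dense document-by-feature matrix with one shared column order."""
    vocab = sorted({gram for doc in docs for gram in doc})
    rows = [[doc.get(gram, 0.0) for gram in vocab] for doc in docs]
    return vocab, rows


# Hypothetical UPOS sequences standing in for two parsed Finnish texts
doc_a = ["PRON", "VERB", "DET", "NOUN", "PUNCT"]
doc_b = ["NOUN", "VERB", "ADV", "PUNCT"]

features = [relative_frequencies(ngram_counts(doc, (1, 2))) for doc in (doc_a, doc_b)]
vocab, matrix = to_matrix(features)
```

Each row of `matrix` is one text and each column one n-gram. A matrix of this shape, paired with class labels such as "nontranslated", "direct from Swedish", or "Greek via German", is the kind of input a random forest classifier (for example scikit-learn's `RandomForestClassifier`) accepts; the classification step itself is not shown here.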