OPUS corpus toolkit for ensuring intelligent translation (case study of L1 and L2 texts of English-Ukrainian film discourse)

MESSENGER of Kyiv National Linguistic University. Series Philology. Published: 2023-03-14. DOI: 10.32589/2311-0821.2.2022.274929

Y. Kapranov, T. V. Tron, B. O. Ivanovska



Abstract


The article explains the concept of “translation memory” and defines it as a computer database that stores segments of L1 texts from different discourses together with the equivalents of these segments in L2. Computer-Aided Translation, Machine Translation, and the parallel corpus toolkit are outlined as the main types of translation memory. Computer-Aided Translation is treated as the process of translating an L1 text into L2 using specialized computer software. The human factor plays one of the most important roles in this process, because the L1 text undergoes three types of processing: pre-, inter-, and post-editing. Machine Translation is viewed in a narrow sense as the process of translating a text from L1 to L2 that is performed wholly or partly by a computer, and in a broad sense as a branch of scientific research that falls within the focus of Linguistics, Mathematics, and Cybernetics and aims to build a system implementing Machine Translation in the narrow sense. A parallel corpus toolkit is a database of L1 and L2 texts that contains a large number of texts of different discourses, issues, and topics. Attention is also paid to the OPUS corpus toolkit as one of the translation memory types that ensures the efficiency of intelligent translation. It is currently a free corpus system in the public domain that contains corpora of texts from L1 and L2 to L3...Ln drawn from numerous Internet resources and is constantly updated. The capabilities of the OPUS corpus tool were tested and proved effective for verifying one-, two-, and three-component L2 lexical constructs, using L1 and L2 text fragments belonging to film discourse as examples.
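
As a rough illustration of the definition above, a translation memory can be sketched as a list of aligned L1/L2 segment pairs, and verifying an L2 lexical construct can be approximated by counting how often it appears opposite a given L1 phrase. This is a minimal sketch under stated assumptions, not the authors' procedure and not the OPUS interface: the sample pairs, the names `TM`, `component_count`, and `verify_construct`, and the substring-matching rule are all hypothetical.

```python
# Hypothetical aligned segment pairs (L1 English, L2 Ukrainian).
# Illustrative data only; not taken from the article or from OPUS.
TM = [
    ("Good morning", "Доброго ранку"),
    ("Good morning, sir", "Доброго ранку, сер"),
    ("See you in the morning", "Побачимося вранці"),
]


def component_count(construct):
    """Number of whitespace-separated components in a lexical construct."""
    return len(construct.split())


def verify_construct(memory, l1_phrase, l2_construct):
    """Return (matches, attested).

    matches  = aligned pairs whose L1 side contains l1_phrase
    attested = those matches whose L2 side also contains l2_construct
    Matching is a case-insensitive substring test (an assumption made
    for this sketch; real corpus queries are usually token-based).
    """
    l1 = l1_phrase.lower()
    l2 = l2_construct.lower()
    matches = [(s, t) for s, t in memory if l1 in s.lower()]
    attested = [(s, t) for s, t in matches if l2 in t.lower()]
    return len(matches), len(attested)


if __name__ == "__main__":
    print(component_count("доброго ранку"))
    print(verify_construct(TM, "good morning", "доброго ранку"))
    print(verify_construct(TM, "morning", "вранці"))
```

A high ratio of attested pairs to matches would suggest that the candidate L2 construct is a conventional equivalent in the sampled discourse; a low ratio would suggest checking alternatives.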
