Arabic-English Parallel Corpus: A New Resource for Translation Training and Language Teaching

AARN: Applied Linguistic Anthropology (Topic) Pub Date : 2017-09-15 DOI:10.2139/ssrn.3053572

H. Alotaibi


Citations: 53

Abstract



Parallel corpora can be defined as collections of aligned, translated texts of two or more languages. They play a major role in translation and contrastive studies, and are also becoming popular in translation training and language teaching, with the advent of the data-driven learning (DDL) approach. Despite their significance, however, Arabic seems to lack a satisfactory general-use parallel corpus resource. The literature describes few Arabic–English parallel corpora, and these few are usually inaccurate and/or expensive. Some are small in size, while others are restricted in terms of genre, failing to meet the requirements of many academics and researchers. This paper describes an ongoing project at the College of Languages and Translation, King Saud University, to compile a 10-million-word Arabic–English parallel corpus to be used as a resource for translation training and language teaching. The bidirectional corpus can be used to compare translated and source language and identify differences. The corpus has been manually verified at different stages, including translation, text segmentation, alignment, and file preparation; it is available as full-text in XML format and through a user-friendly web interface that provides a concordancer to support bilingual search queries and several filtering options.


Source journal

AARN: Applied Linguistic Anthropology (Topic)

Self-citation rate

0.00%

Number of publications