Benjamin Ka-Yin T'sou, Ka-Po Chow, Junru Nie, Yuan Yuan, Hong Kong Chilin Ltd.
Title: Towards a Proactive MWE Terminological Platform for Cross-Lingual Mediation in the Age of Big Data
Published in: Proceedings of the Second Workshop Human-Informed Translation and Interpreting Technology associated with RANLP 2019
Publication date: 2019-10-30
DOI: https://doi.org/10.26615/issn.2683-0078.2019_014
Citations: 2
Abstract
The emergence of China as a global economic power in the 21st century has brought about surging needs for cross-lingual and cross-cultural mediation, typically performed by translators. Advances in Artificial Intelligence and Language Engineering have been bolstered by machine learning and suitable Big Data cultivation. They have helped to meet some of translators’ needs, though technical specialists have not kept pace with the practical and expanding requirements of language mediation. One major technical and linguistic hurdle involves words outside the vocabulary of the translator or of the lexical database he/she consults, especially Multi-Word Expressions (MWEs, or compound words) in technical subjects. A further problem is the multiplicity of renditions of a term in the target language. This paper discusses a proactive approach that follows the successful extraction and application of a sizable set of bilingual MWEs for language mediation in technical subjects. Such subjects do not fall within the expertise of typical translators, who have an inadequate appreciation of the range of new technical tools available to help them. Our approach draws on the personal reflections of translators and teachers of translation and builds on prior R&D efforts relating to 300,000 comparable Chinese-English patents. The protocol we have subsequently developed aims to be proactive in meeting four identified practical challenges in technical translation (e.g. patents). It has broader economic implications in the Age of Big Data (Tsou et al., 2015) and Trade War, as the workload, if not the challenges, increasingly cannot be met by currently available front-line translators. We shall demonstrate how new tools can be harnessed to spearhead the application of language technology not only in language mediation but also in the “teaching” and “learning” of translation. We also show how a better appreciation of translators’ needs may enhance the contributions of the technical specialists, and thus the resulting synergistic benefits.