{"title":"Exploring Adequacy Errors in Neural Machine Translation with the Help of Cross-Language Aligned Word Embeddings","authors":"M. Ustaszewski","doi":"10.26615/issn.2683-0078.2019_015","DOIUrl":"https://doi.org/10.26615/issn.2683-0078.2019_015","url":null,"abstract":"Neural machine translation (NMT) was shown to produce more fluent output than phrase-based statistical (PBMT) and rule-based machine translation (RBMT). However, improved fluency makes it more difficult for post editors to identify and correct adequacy errors, because unlike RBMT and SMT, in NMT adequacy errors are frequently not anticipated by fluency errors. Omissions and additions of content in otherwise flawlessly fluent NMT output are the most prominent types of such adequacy errors, which can only be detected with reference to source texts. This contribution explores the degree of semantic similarity between source texts, NMT output and post edited output. In this way, computational semantic similarity scores (cosine similarity) are related to human quality judgments. The analyses are based on publicly available NMT post editing data annotated for errors in three language pairs (EN-DE, EN-LV, EN-HR) with the Multidimensional Quality Metrics (MQM). 
Methodologically, this contribution tests whether cross-language aligned word embeddings as the sole source of semantic information mirror human error annotation.","PeriodicalId":313947,"journal":{"name":"Proceedings of the Second Workshop Human-Informed Translation and Interpreting Technology associated with RANLP 2019","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127135461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Human Evaluation of Neural Machine Translation: The Case of Deep Learning","authors":"Marie Escribe","doi":"10.26615/issn.2683-0078.2019_005","DOIUrl":"https://doi.org/10.26615/issn.2683-0078.2019_005","url":null,"abstract":"Recent advances in artificial neural networks now have a great impact on translation technology. A considerable achievement was reached in this field with the publication of L’Apprentissage Profond. This book, originally written in English (Deep Learning), was entirely machine-translated into French and post-edited by several experts. In this context, it appears essential to have a clear vision of the performance of MT tools. Providing an evaluation of NMT is precisely the aim of the present research paper. To accomplish this objective, a framework for error categorisation was built and a comparative analysis of the raw translation output and the post-edited version was performed with the purpose of identifying recurring patterns of errors. The findings showed that even though some grammatical errors were spotted, the output was generally correct from a linguistic point of view. The most recurring errors are linked to the specialised terminology employed in this book. Further errors include parts of text that were not translated as well as edits based on stylistic preferences. 
The major part of the output was not acceptable as such and required several edits per segment, but some sentences were of publishable quality and were therefore left untouched in the final version.","PeriodicalId":313947,"journal":{"name":"Proceedings of the Second Workshop Human-Informed Translation and Interpreting Technology associated with RANLP 2019","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124255012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimising the Machine Translation Post-editing Workflow","authors":"A. Zaretskaya","doi":"10.26615/issn.2683-0078.2019_018","DOIUrl":"https://doi.org/10.26615/issn.2683-0078.2019_018","url":null,"abstract":"In this article, we describe how machine translation is used for post-editing at TransPerfect and the ways in which we optimise the workflow. This includes MT evaluation, MT engine customisation, leveraging MT suggestions compared to TM matches, and the lessons learnt from implementing MT at a large scale.","PeriodicalId":313947,"journal":{"name":"Proceedings of the Second Workshop Human-Informed Translation and Interpreting Technology associated with RANLP 2019","volume":"115 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131417431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Human-Informed Speakers and Interpreters Analysis in the WAW Corpus and an Automatic Method for Calculating Interpreters’ Décalage","authors":"Irina Temnikova, Ahmed Abdelali, Souhila Djabri, S. Hedaya","doi":"10.26615/issn.2683-0078.2019_013","DOIUrl":"https://doi.org/10.26615/issn.2683-0078.2019_013","url":null,"abstract":"This article presents a multi-faceted analysis of a subset of interpreted conference speeches from the WAW corpus for the English-Arabic language pair. We analyze several speakers and interpreters variables via manual annotation and automatic methods. We propose a new automatic method for calculating interpreters’ décalage based on Automatic Speech Recognition (ASR) and automatic alignment of named entities and content words between speaker and interpreter. The method is evaluated by two human annotators who have expertise in interpreting and Interpreting Studies and shows highly satisfactory results, accompanied with a high inter-annotator agreement. We provide insights about the relations of speakers’ variables, interpreters’ variables and décalage and discuss them from Interpreting Studies and interpreting practice point of view. 
We had interesting findings about interpreters behavior which need to be extended to a large number of conference sessions in our future research.","PeriodicalId":313947,"journal":{"name":"Proceedings of the Second Workshop Human-Informed Translation and Interpreting Technology associated with RANLP 2019","volume":"86 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134151661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Chinese/English Political Interpreting Corpus (CEPIC): A New Electronic Resource for Translators and Interpreters","authors":"Jun Pan","doi":"10.26615/issn.2683-0078.2019_010","DOIUrl":"https://doi.org/10.26615/issn.2683-0078.2019_010","url":null,"abstract":"The Chinese/English Political Interpreting Corpus (CEPIC) is a new electronic and open access resource developed for translators and interpreters, especially those working with political text types. Over 6 million word tokens in size, the online corpus consists of transcripts of Chinese (Cantonese & Putonghua) / English political speeches and their translated and interpreted texts. It includes rich meta-data and is POS-tagged and annotated with prosodic and paralinguistic features that are of concern to spoken language and interpreting. The online platform of the CEPIC features main functions including Keyword Search, Word Collocation and Expanded Keyword in Context, which are illustrated in the paper. The CEPIC can shed light on online translation and interpreting corpora development in the future.","PeriodicalId":313947,"journal":{"name":"Proceedings of the Second Workshop Human-Informed Translation and Interpreting Technology associated with RANLP 2019","volume":"85 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116417057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Four Stages of Machine Translation Acceptance in a Freelancer’s Life","authors":"Maria Sgourou","doi":"10.26615/issn.2683-0078.2019_017","DOIUrl":"https://doi.org/10.26615/issn.2683-0078.2019_017","url":null,"abstract":"Technology is a big challenge and raises many questions and issues when it comes to its application in the translation process, but translation’s biggest problem is not technology; it is rather how technology is perceived by translators. MT developers and researchers should take into account this perception and move towards a more democratized approach to include the base of the translation industry and perhaps its more valuable asset, the translators.","PeriodicalId":313947,"journal":{"name":"Proceedings of the Second Workshop Human-Informed Translation and Interpreting Technology associated with RANLP 2019","volume":"309 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122972696","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}