{"title":"From Subtitles to Parallel Corpora","authors":"Mark Fishel, Yota Georgakopoulou, Sergio Penkale, V. Petukhova, M. Rojc, M. Volk, Andy Way","doi":"10.5167/UZH-63327","DOIUrl":"https://doi.org/10.5167/UZH-63327","url":null,"abstract":"We describe the preparation of parallel corpora based on professional quality subtitles in seven European language pairs. The main focus is the effect of the processing steps on the size and quality of the final corpora.","PeriodicalId":137211,"journal":{"name":"European Association for Machine Translation Conferences/Workshops","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129957476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mixture-Modeling with Unsupervised Clusters for Domain Adaptation in Statistical Machine Translation","authors":"Rico Sennrich","doi":"10.5167/UZH-62826","DOIUrl":"https://doi.org/10.5167/UZH-62826","url":null,"abstract":"In Statistical Machine Translation, in-domain and out-of-domain training data are not always clearly delineated. This paper investigates how we can still use mixture-modeling techniques for domain adaptation in such cases. We apply unsupervised clustering methods to split the original training set, and then use mixture-modeling techniques to build a model adapted to a given target domain. We show that this approach improves performance over an unadapted baseline, and several alternative domain adaptation methods.","PeriodicalId":137211,"journal":{"name":"European Association for Machine Translation Conferences/Workshops","volume":"1230 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126050618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interlingual strategies in translation","authors":"P. E. Pause","doi":"10.1515/9783110802474.175","DOIUrl":"https://doi.org/10.1515/9783110802474.175","url":null,"abstract":"","PeriodicalId":137211,"journal":{"name":"European Association for Machine Translation Conferences/Workshops","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134528692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Description and acquisition of multiword lexemes","authors":"Angelika Storrer, Ulrike Schwall","doi":"10.1007/3-540-59040-4_19","DOIUrl":"https://doi.org/10.1007/3-540-59040-4_19","url":null,"abstract":"","PeriodicalId":137211,"journal":{"name":"European Association for Machine Translation Conferences/Workshops","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126201586","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A generic lexical model","authors":"D. Bachut, Isabelle Duquennoy, R. L. Humphreys, Tita Kyriakopoulou, A. Monceaux, F. Namer, Jean-Michel Ombrouck, C. Perrey, Anne Poncet-Montange, M. Puerta, Caroline Raffy, Brigitte Roudaud, S. Sabbagh","doi":"10.1007/3-540-59040-4_26","DOIUrl":"https://doi.org/10.1007/3-540-59040-4_26","url":null,"abstract":"","PeriodicalId":137211,"journal":{"name":"European Association for Machine Translation Conferences/Workshops","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125754305","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Can Automatic Post-Editing Make MT More Meaningful","authors":"K. McKeown, K. Parton, Nizar Habash, Gonzalo Iglesias, A. Gispert","doi":"10.7916/D80V8N4B","DOIUrl":"https://doi.org/10.7916/D80V8N4B","url":null,"abstract":"Automatic post-editors (APEs) enable the re-use of black box machine translation (MT) systems for a variety of tasks where different aspects of translation are important. In this paper, we describe APEs that target adequacy errors, a critical problem for tasks such as cross-lingual question-answering, and compare different approaches for post-editing: a rule-based system and a feedback approach that uses a computer in the loop to suggest improvements to the MT system. We test the APEs on two different MT systems and across two different genres. Human evaluation shows that the APEs significantly improve adequacy, regardless of approach, MT system or genre: 30-56% of the post-edited sentences have improved adequacy compared to the original MT.","PeriodicalId":137211,"journal":{"name":"European Association for Machine Translation Conferences/Workshops","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123461485","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}