Can we beat Google Translate?
Marija Brkic Bakaric, Bozena Basic Mikulic, M. Matetić
Proceedings of the ITI 2012 34th International Conference on Information Technology Interfaces, 2012-06-25
DOI: 10.2498/iti.2012.0411 (https://doi.org/10.2498/iti.2012.0411)
Citations: 2
Abstract
This paper presents a machine translation evaluation study for the Croatian-English language pair. In-domain and out-of-domain translations from Croatian into English were obtained from Google Translate, from our own statistical machine translation system LegTran, and from a professional translator. The translations were evaluated with six different automatic metrics. The gains from increasing the number of reference translations are explored and measured. System-level correlation between the automatic evaluation metrics is reported, and the significance of the results is discussed. Bootstrapping, approximate randomization, and the sign test are used for confidence intervals and hypothesis testing.
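
The three significance procedures named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each system has one score per segment and that the system score is the mean of those segment scores, which does not hold for corpus-level metrics such as BLEU. The function names, parameters, and toy scores are hypothetical.

```python
import random
from math import comb
from statistics import mean


def bootstrap_ci(scores, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean segment score."""
    rng = random.Random(seed)
    n = len(scores)
    # Resample segments with replacement and record the mean of each resample.
    means = sorted(mean(rng.choices(scores, k=n)) for _ in range(n_resamples))
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


def approximate_randomization(scores_a, scores_b, n_shuffles=1000, seed=0):
    """Two-sided p-value for the difference in mean score of two systems."""
    rng = random.Random(seed)
    observed = abs(mean(scores_a) - mean(scores_b))
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        shuffled_a, shuffled_b = [], []
        for a, b in zip(scores_a, scores_b):
            # Swap the two systems' scores for this segment with probability 0.5.
            if rng.random() < 0.5:
                a, b = b, a
            shuffled_a.append(a)
            shuffled_b.append(b)
        if abs(mean(shuffled_a) - mean(shuffled_b)) >= observed:
            at_least_as_extreme += 1
    # Add-one smoothing keeps the estimated p-value strictly positive.
    return (at_least_as_extreme + 1) / (n_shuffles + 1)


def sign_test(scores_a, scores_b):
    """Two-sided exact sign test on paired segment scores, ignoring ties."""
    wins = sum(a > b for a, b in zip(scores_a, scores_b))
    losses = sum(a < b for a, b in zip(scores_a, scores_b))
    n = wins + losses
    if n == 0:
        return 1.0
    k = min(wins, losses)
    # Binomial(n, 0.5) probability of a split at least this lopsided.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical per-segment scores for two systems (not data from the paper).
    system_a = [0.42, 0.55, 0.31, 0.60, 0.48, 0.52, 0.39, 0.58]
    system_b = [0.40, 0.50, 0.33, 0.51, 0.45, 0.47, 0.35, 0.49]
    print("95% CI for system A:", bootstrap_ci(system_a))
    print("approximate randomization p:", approximate_randomization(system_a, system_b))
    print("sign test p:", sign_test(system_a, system_b))
```

All three functions work on paired segments, so both systems must be scored on the same test sentences in the same order. Fixing the seed makes the resampling reproducible. The sign test uses only the direction of each per-segment difference, while approximate randomization also uses its magnitude.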