Refining lexical translation training scheme for improving the quality of statistical phrase-based translation
Cuong Hoang, C. Le, S. Pham
Proceedings of the 3rd Symposium on Information and Communication Technology, 2012-08-23
DOI: 10.1145/2350716.2350727 (https://doi.org/10.1145/2350716.2350727)
Citations: 2
Abstract
Under word-based alignment, frequent words with consistent translations can be aligned with high precision. However, words that are less frequent or that exhibit diverse translations in the training corpora generally lack statistically significant evidence for confident alignment [7]. In this work, we propose a bootstrapping algorithm to capture the alignments of these less frequent or more diversely translated words. Notably, we make no explicit assumptions about the language pair used. Accordingly, we evaluate the approach on two phrase-based translation systems: English-Vietnamese and English-French. The experiments show a significant "boosting" effect on overall translation quality for both tasks.