{"title":"从语法增强同步解析到使用结构对齐的集成机器翻译","authors":"Bing Xiang, Bowen Zhou, Martin Cmejrek","doi":"10.1109/ASRU.2009.5372892","DOIUrl":null,"url":null,"abstract":"In current statistical machine translation, IBM model based word alignment is widely used as a starting point to build phrase-based machine translation systems. However, such alignment model is separated from the rest of machine translation pipeline and optimized independently. Furthermore, structural information is not taken into account in the alignment model, which sometimes leads to incorrect alignments. In this paper, we present a novel method to connect a re-alignment model with a translation model in an integrated framework. We conduct bilingual chart parsing based on syntax-augmented synchronous context-free grammar. A Viterbi derivation tree is generated for each sentence pair with multiple features employed in a log-linear model. A new word alignment is created under the structural constraint from the Viterbi tree. Extensive experiments are conducted in a Farsi-to-English translation task in conversational speech domain and also a German-to-English translation task in text domain. Systems trained on the new alignment provide significant higher BLEU scores compared to a state-of-the-art baseline.","PeriodicalId":292194,"journal":{"name":"2009 IEEE Workshop on Automatic Speech Recognition & Understanding","volume":"12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Towards integrated machine translation using structural alignment from syntax-augmented synchronous parsing\",\"authors\":\"Bing Xiang, Bowen Zhou, Martin Cmejrek\",\"doi\":\"10.1109/ASRU.2009.5372892\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In current statistical machine translation, IBM model based word alignment is widely used as a starting point to build phrase-based machine translation systems. However, such alignment model is separated from the rest of machine translation pipeline and optimized independently. Furthermore, structural information is not taken into account in the alignment model, which sometimes leads to incorrect alignments. In this paper, we present a novel method to connect a re-alignment model with a translation model in an integrated framework. We conduct bilingual chart parsing based on syntax-augmented synchronous context-free grammar. A Viterbi derivation tree is generated for each sentence pair with multiple features employed in a log-linear model. A new word alignment is created under the structural constraint from the Viterbi tree. Extensive experiments are conducted in a Farsi-to-English translation task in conversational speech domain and also a German-to-English translation task in text domain. Systems trained on the new alignment provide significant higher BLEU scores compared to a state-of-the-art baseline.\",\"PeriodicalId\":292194,\"journal\":{\"name\":\"2009 IEEE Workshop on Automatic Speech Recognition & Understanding\",\"volume\":\"12 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2009-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2009 IEEE Workshop on Automatic Speech Recognition & Understanding\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ASRU.2009.5372892\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 IEEE Workshop on Automatic Speech Recognition & Understanding","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASRU.2009.5372892","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Towards integrated machine translation using structural alignment from syntax-augmented synchronous parsing
In current statistical machine translation, IBM model based word alignment is widely used as a starting point to build phrase-based machine translation systems. However, such alignment model is separated from the rest of machine translation pipeline and optimized independently. Furthermore, structural information is not taken into account in the alignment model, which sometimes leads to incorrect alignments. In this paper, we present a novel method to connect a re-alignment model with a translation model in an integrated framework. We conduct bilingual chart parsing based on syntax-augmented synchronous context-free grammar. A Viterbi derivation tree is generated for each sentence pair with multiple features employed in a log-linear model. A new word alignment is created under the structural constraint from the Viterbi tree. Extensive experiments are conducted in a Farsi-to-English translation task in conversational speech domain and also a German-to-English translation task in text domain. Systems trained on the new alignment provide significant higher BLEU scores compared to a state-of-the-art baseline.