Authors: Quy T. Nguyen, An Nguyen, Dinh Dien
Published in: 2012 IEEE RIVF International Conference on Computing & Communication Technologies, Research, Innovation, and Vision for the Future
Publication date: 2012-03-15
DOI: https://doi.org/10.1109/rivf.2012.6169839
An Approach to Word Sense Disambiguation in English-Vietnamese-English Statistical Machine Translation
The most difficult problem in machine translation (MT) in general, and in statistical machine translation (SMT) in particular, is selecting the correct meaning of polysemous words. The correct meaning depends mainly on the context and the topic of the text. Therefore, to improve the quality of SMT by resolving the semantic ambiguity of words, we integrate additional knowledge about the topic of the text, part-of-speech (POS) tags, and morphology. We applied this model to an English-Vietnamese-English SMT system, and BLEU scores increased by over 6% compared with the baseline general SMT system, which did not incorporate topic information or other linguistic knowledge.
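The idea of letting the topic of a text guide the choice of translation for a polysemous word can be illustrated with a minimal Python sketch. It is not the authors' model: the table `TRANSLATION_TABLE`, the function `translate_word`, the topic labels, and all probabilities are hypothetical and invented purely for illustration. The example uses the English word "bank", which may be rendered in Vietnamese as "ngân hàng" (financial institution) or "bờ sông" (river bank).

```python
# Toy topic-conditioned translation table.
# All entries and probabilities are invented for illustration only;
# they are not taken from the paper or from any trained model.
TRANSLATION_TABLE = {
    ("bank", "finance"): {"ngân hàng": 0.9, "bờ sông": 0.1},
    ("bank", "geography"): {"ngân hàng": 0.2, "bờ sông": 0.8},
    ("bank", "general"): {"ngân hàng": 0.6, "bờ sông": 0.4},
}


def translate_word(word, topic="general"):
    """Return the most probable translation of `word` given a topic label."""
    candidates = TRANSLATION_TABLE.get((word, topic))
    if candidates is None:
        # Unknown topic: fall back to the topic-independent distribution.
        candidates = TRANSLATION_TABLE.get((word, "general"))
    if candidates is None:
        # Unknown word: pass it through unchanged.
        return word
    return max(candidates, key=candidates.get)


print(translate_word("bank", "finance"))
print(translate_word("bank", "geography"))
```

In this sketch the topic label simply selects a different probability distribution over candidate translations. A real SMT system would combine such topic-dependent scores with language-model and other feature scores during decoding rather than choosing each word in isolation.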