Using Self-Organizing Map and Data Mining Measurements to Improve Thai-English Statistical Machine Translation
Singha Wongdeethai, Jumpol Polvichai, N. Netjinda
2011 International Conference on Information Science and Applications, 2011-04-26
DOI: https://doi.org/10.1109/ICISA.2011.5772395
Citations: 1
Abstract
The objective of this work is to improve Statistical Machine Translation (SMT) by using a Self-Organizing Map (SOM). An SMT system generally has two processes: training and translating. The training process prepares resources from a number of bilingual corpora, and the translating process then uses those resources. However, the trained resources still contain a large amount of irrelevant data. This research focuses on a new SOM-based method that filters as much irrelevant data as possible out of the final translation model. Initial results indicate that SOM-based filtering removes incorrect pairings more efficiently than a general statistical method, so a better statistical translation model can be created. We therefore expect that the efficiency of Thai-English SMT could be improved by using this improved statistical model.
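The abstract does not say which features, SOM topology, or decision rule the authors used, so the following is only a minimal sketch of the general idea of SOM-based filtering, not a reconstruction of their method. It assumes each candidate phrase pair is described by two hypothetical scores in [0, 1] (labelled `support` and `confidence` here, in the spirit of the data mining measurements named in the title), trains a tiny one-dimensional SOM on those scores, and drops every pair whose best-matching unit has a low mean weight. The function names, the threshold of 0.5, and the toy Thai-English pairs are all invented for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x (squared Euclidean)."""
    return min(
        range(len(weights)),
        key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], x)),
    )


def train_som(data, n_units=2, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a 1-D SOM (n_units >= 2) and return its unit weight vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    lo = [min(x[k] for x in data) for k in range(dim)]
    hi = [max(x[k] for x in data) for k in range(dim)]
    # Linear initialisation: spread the units between the per-feature min and max.
    weights = [
        [lo[k] + (hi[k] - lo[k]) * j / (n_units - 1) for k in range(dim)]
        for j in range(n_units)
    ]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.3)  # shrinking neighbourhood width
        for x in rng.sample(data, len(data)):    # one shuffled pass over the data
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighbourhood on the 1-D grid of units.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


def filter_pairs(pairs, features, weights, threshold=0.5):
    """Keep pairs whose best-matching unit has a mean weight >= threshold."""
    kept = []
    for pair, x in zip(pairs, features):
        unit = weights[best_matching_unit(weights, x)]
        if sum(unit) / len(unit) >= threshold:
            kept.append(pair)
    return kept


# Toy candidate pairs with invented (support, confidence) scores.
candidates = [
    (("แมว", "cat"), (0.90, 0.80)),
    (("แมว", "water"), (0.10, 0.05)),
    (("สุนัข", "dog"), (0.85, 0.90)),
    (("น้ำ", "dog"), (0.15, 0.10)),
    (("น้ำ", "water"), (0.95, 0.85)),
    (("สุนัข", "the"), (0.05, 0.12)),
]
pairs = [p for p, _ in candidates]
features = [f for _, f in candidates]

weights = train_som(features, n_units=2)
kept = filter_pairs(pairs, features, weights)
print(kept)
```

In this sketch the SOM only groups pairs with similar scores; the decision to treat the low-weight unit as noise is a separate, hand-picked rule. A real system would need features and a cut-off justified by held-out translation quality.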