A Corpus Based N-gram Hybrid Approach of Bengali to English Machine Translation
M. M. Rahman, Md Faisal Kabir, M. N. Huda
2018 21st International Conference of Computer and Information Technology (ICCIT), 2018-12-01
DOI: 10.1109/ICCITECHN.2018.8631938 (https://doi.org/10.1109/ICCITECHN.2018.8631938)
Citations: 8
Abstract
Machine translation is automatic translation performed by computer software. Of the several approaches to machine translation, some need extensive linguistic knowledge while others require enormous statistical computation. This paper presents a hybrid method that integrates a corpus-based approach and a statistical approach to translate Bengali sentences into English with the help of an N-gram language model. The corpus-based method finds the target-language translation of each sentence fragment, selecting the best-matching text from the bilingual corpus to acquire knowledge, while the N-gram model rearranges the sentence constituents to produce an accurate translation without employing external linguistic rules. A variety of Bengali sentences, covering various structures and verb tenses, are translated with the new system. The performance of the proposed system is evaluated in terms of adequacy, fluency, word error rate (WER), and BLEU score. The assessment scores are compared with those of other conventional approaches as well as with Google Translate, Google's well-known free machine translation service. The experimental results show that the proposed system achieves higher scores than Google Translate and the other methods at lower computational cost.
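
The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the fragment table, the toy English corpus, the add-one smoothing, the exhaustive search over permutations, and all function names (`train_bigram_counts`, `log_prob`, `translate`) are assumptions made for illustration. The corpus-based stage is reduced to an exact word lookup, and the N-gram stage is a bigram model that picks the highest-scoring ordering of the translated fragments.

```python
import math
from collections import Counter
from itertools import permutations


def train_bigram_counts(sentences):
    """Count context words and bigrams over a toy English corpus."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens[:-1])             # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent pairs
    return unigrams, bigrams


def log_prob(tokens, unigrams, bigrams, vocab_size):
    """Log-probability of a token sequence under an add-one smoothed bigram model."""
    padded = ["<s>"] + tokens + ["</s>"]
    return sum(
        math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
        for prev, cur in zip(padded, padded[1:])
    )


def translate(source, fragment_table, unigrams, bigrams):
    """Look up each source fragment, then reorder the results with the bigram model."""
    # Stage 1 (assumed simplification): exact lookup of whitespace-separated fragments.
    fragments = [fragment_table[word] for word in source.split()]
    vocab_size = len(unigrams) + 1  # +1 for the end marker </s>
    # Stage 2: exhaustive search over orderings; only practical for short sentences.
    best = max(
        permutations(fragments),
        key=lambda order: log_prob(" ".join(order).split(), unigrams, bigrams, vocab_size),
    )
    return " ".join(best)


# Hypothetical toy data, not taken from the paper.
english_corpus = ["i eat rice", "he eats rice", "i read a book", "she reads a book"]
fragment_table = {"আমি": "i", "ভাত": "rice", "খাই": "eat"}

unigrams, bigrams = train_bigram_counts(english_corpus)
print(translate("আমি ভাত খাই", fragment_table, unigrams, bigrams))  # i eat rice
```

Bengali is verb-final, so a word-by-word lookup yields "i rice eat"; the bigram model prefers "i eat rice" because the pairs "i eat" and "eat rice" occur in the English corpus. A realistic system would need approximate matching against the bilingual corpus and a search strategy cheaper than enumerating every permutation.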