M. Mridha, Md. Mashod Rana, Md. Abdul Hamid, Md. Eyaseen Arafat Khan, Md. Masud Ahmed, Mohammad Tipu Sultan
2019 International Conference on Electrical, Computer and Communication Engineering (ECCE). Published 2019-02-01. DOI: https://doi.org/10.1109/ECACE.2019.8679416
An Approach for Detection and Correction of Missing Word in Bengali Sentence
Automatically correcting a missing word in a sentence is difficult, and it is more challenging still for the Bengali language. Our study of the literature found no significant prior research on this topic for Bengali. In this paper, we propose a method that detects a missing word and provides a list of suggestions for it with 82.82% accuracy. We use an n-gram model to decide whether a word is missing between two adjacent words of a sentence. After finding the probable candidates for the missing word, we rank the suggestion list by probability score. The detection decision relies on one corpus, a collection of bigrams, and the candidate words are drawn from a second corpus, a collection of trigrams. Finally, we evaluate the proposed method on six further corpora. We created all corpora ourselves from data collected from the web.
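
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the implementation from the paper: the abstract does not state the exact detection rule or scoring formula, so the sketch assumes that a gap is suspected whenever two adjacent words never occur together in the bigram corpus, and that candidates are scored by their relative frequency as the middle word of a matching trigram. The function names `build_models` and `suggest_missing`, the `top_k` parameter, and the toy English corpus are all hypothetical and stand in for the Bengali corpora.

```python
from collections import Counter, defaultdict


def build_models(sentences):
    """Count bigrams and trigrams from tokenised sentences (toy stand-in for the corpora)."""
    bigrams = Counter()
    # Trigrams are indexed by their outer words: (left, right) -> counts of middle words.
    trigrams = defaultdict(Counter)
    for tokens in sentences:
        for left, right in zip(tokens, tokens[1:]):
            bigrams[(left, right)] += 1
        for left, middle, right in zip(tokens, tokens[1:], tokens[2:]):
            trigrams[(left, right)][middle] += 1
    return bigrams, trigrams


def suggest_missing(tokens, bigrams, trigrams, top_k=3):
    """Return a list of (insert_position, [(word, score), ...]) for each suspected gap."""
    results = []
    for i, (left, right) in enumerate(zip(tokens, tokens[1:])):
        # Assumed detection rule: an attested bigram means nothing is missing here.
        if bigrams[(left, right)] > 0:
            continue
        candidates = trigrams.get((left, right))
        if not candidates:
            continue  # no trigram evidence, so no suggestion is offered
        total = sum(candidates.values())
        # Assumed scoring: relative frequency of each middle word, highest first.
        ranked = [(word, count / total) for word, count in candidates.most_common(top_k)]
        results.append((i + 1, ranked))
    return results


# Hypothetical toy corpus; the paper's corpora are Bengali text collected from the web.
corpus = [
    ["i", "eat", "rice"],
    ["i", "eat", "rice"],
    ["i", "cook", "rice"],
    ["we", "eat", "fish"],
]
bigrams, trigrams = build_models(corpus)
result = suggest_missing(["i", "rice"], bigrams, trigrams)
print(result)
```

For the input `["i", "rice"]` the pair is unseen as a bigram, so the sketch proposes inserting a word at position 1 and ranks "eat" above "cook" because it appears twice as often between those two words. A sentence whose adjacent pairs are all attested, such as `["i", "eat", "rice"]`, yields no suggestions.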