Denoising Sequence-to-Sequence Modeling for Removing Spelling Mistakes
Shuvendu Roy
2019 1st International Conference on Advances in Science, Engineering and Robotics Technology (ICASERT), pp. 1-5
Published: 2019-05-01
DOI: https://doi.org/10.1109/ICASERT.2019.8934902
Citations: 4
Abstract
Rule-based spelling correction systems focus on finding the word that best matches a misspelled word. This approach does not work well for a sentence with multiple errors, where many combinations of candidate replacements are possible but only one sentence is correct. Replacing each word individually will result in errors, so a spelling corrector must understand the context of the sentence, including the tense, the gender of the subject, and so on. The most popular example of typing-mistake correction is the one Google provides in its search engine. It was introduced quite a while ago, but no comparably well-performing system has been developed by anyone else. In this work, we propose a spelling correction system using deep learning. The basic intuition of our approach is taken from the denoising autoencoder. We train the model on noisy input generated by changing, removing, or adding characters at random positions in the sequence. The job of the model is to map this noisy input back to the original error-free sequence. We experimented with a large English dataset and report performance in terms of character-level accuracy. The proposed model shows impressive results in correcting spelling mistakes.
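The noise-generation step and the evaluation metric described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the function names `add_noise` and `char_accuracy`, the `error_rate` parameter, the lowercase ASCII alphabet, and the position-wise definition of character-level accuracy are all assumptions made for illustration, since the abstract does not specify them.

```python
import random
import string


def add_noise(text, error_rate=0.1, rng=None, alphabet=string.ascii_lowercase):
    """Corrupt `text` by randomly changing, removing, or adding characters.

    Hypothetical helper: `error_rate` is the assumed per-character
    probability of applying one of the three corruption operations.
    """
    rng = rng or random.Random()
    out = []
    for ch in text:
        if rng.random() < error_rate:
            op = rng.choice(("change", "remove", "add"))
            if op == "change":
                # Replace the character with a random one.
                out.append(rng.choice(alphabet))
            elif op == "add":
                # Keep the character and insert a random one after it.
                out.append(ch)
                out.append(rng.choice(alphabet))
            # "remove": append nothing, which drops the character.
        else:
            out.append(ch)
    return "".join(out)


def char_accuracy(predicted, target):
    """Fraction of positions where the two strings agree.

    Assumed definition: matches are counted position by position and
    divided by the longer length, so extra or missing characters count
    as errors.
    """
    if not predicted and not target:
        return 1.0
    matches = sum(p == t for p, t in zip(predicted, target))
    return matches / max(len(predicted), len(target))


if __name__ == "__main__":
    clean = "the quick brown fox jumps over the lazy dog"
    noisy = add_noise(clean, error_rate=0.15, rng=random.Random(0))
    # (noisy, clean) is one training pair: the model reads `noisy`
    # and is trained to emit `clean`.
    print(noisy)
    print(char_accuracy(noisy, clean))
```

Each (noisy, clean) pair produced this way would serve as one input/target example for a sequence-to-sequence model. Because a single inserted or deleted character shifts every later position, a position-wise metric like the one sketched here penalizes length errors heavily; an alignment-based metric would be a reasonable alternative.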