Spell corrector to social media datasets in message filtering systems
Authors: Zar Zar Wint, Theo Ducros, M. Aritsugi
Published in: 2017 Twelfth International Conference on Digital Information Management (ICDIM), 2017-09-01
DOI: https://doi.org/10.1109/ICDIM.2017.8244677
We develop a spell checker and corrector to detect word errors in social media datasets, intended for use in message filtering systems, especially for cyberbullying detection. We use dictionary techniques to check words, together with ten approaches to checking and correcting spelling errors. When the approaches yield more than one candidate correction, we use Levenshtein distance to choose the corrected word from the words in the dictionary. The spell correction results were around 90%. Moreover, the percentage obtained for each approach highlighted the effectiveness of adding letters to the word.
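
The candidate-selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy dictionary, and the alphabetical tie-breaking rule are all assumptions made for illustration, and the ten correction approaches themselves are not reproduced here.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def choose_correction(word: str, candidates: list[str], dictionary: set[str]) -> str:
    """Pick the dictionary candidate closest to `word`.

    Assumptions (not from the paper): candidates missing from the dictionary
    are discarded, ties are broken alphabetically, and the original word is
    returned unchanged when no candidate survives.
    """
    valid = [c for c in candidates if c in dictionary]
    if not valid:
        return word
    return min(valid, key=lambda c: (levenshtein(word, c), c))


if __name__ == "__main__":
    toy_dictionary = {"happy", "hippy"}  # hypothetical dictionary
    print(choose_correction("hapy", ["happy", "hippy", "zzz"], toy_dictionary))
```

Here `candidates` stands in for the words proposed by the individual correction approaches. Sorting on the pair (distance, word) only makes the choice deterministic when several candidates are equally close; the paper does not state how such ties are resolved.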