Possibility Theory-Based Approach to Spam Email Detection
D. Tran, Wanli Ma, D. Sharma, Thien Huu Nguyen
2007 IEEE International Conference on Granular Computing (GRC 2007), 2007-11-02
DOI: 10.1109/GrC.2007.123 (https://doi.org/10.1109/GrC.2007.123)
Citations: 8
Abstract
Most current spam email detection systems use keywords in a blacklist to detect spam emails. However, these keywords can be written with misspellings, for example "baank", "ba-nk" and "bankk" instead of "bank". Moreover, the misspellings change over time, so a spam email detection system needs to constantly update its blacklist to detect spam emails containing them. It is impossible, however, to predict every possible misspelling of a given keyword and add it to the blacklist. We present a possibility theory-based approach to spam email detection to solve this problem. We treat every keyword in the blacklist, together with its misspellings, as a fuzzy set and propose a possibility function. This function is used to calculate a possibility score for an unknown email. Using a proposed if-then rule and this score, we can decide whether or not the unknown email is spam. Experimental results are also presented.
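The pipeline described in the abstract (fuzzy set per keyword, possibility score per email, if-then decision rule) can be illustrated with a minimal Python sketch. The abstract does not give the paper's possibility function, so this sketch substitutes a generic string-similarity ratio from the standard library's `difflib` as a stand-in membership degree. The names `BLACKLIST`, `THRESHOLD`, `membership`, `possibility_score` and `is_spam`, as well as the threshold value, are hypothetical and chosen only for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical blacklist and decision threshold; neither comes from the paper.
BLACKLIST = ["bank", "viagra"]
THRESHOLD = 0.8


def membership(token: str, keyword: str) -> float:
    """Stand-in membership degree of `token` in the fuzzy set of `keyword`.

    Assumption: difflib's similarity ratio in [0, 1] replaces the paper's
    possibility function, which the abstract does not specify.
    """
    return SequenceMatcher(None, token, keyword).ratio()


def possibility_score(email_text: str, blacklist=BLACKLIST) -> float:
    """Return the highest membership degree over all (token, keyword) pairs."""
    # Strip non-letters inside each token so "ba-nk" is compared as "bank".
    tokens = [re.sub(r"[^a-z]", "", t) for t in email_text.lower().split()]
    tokens = [t for t in tokens if t]
    return max(
        (membership(t, k) for t in tokens for k in blacklist),
        default=0.0,  # an email with no usable tokens scores zero
    )


def is_spam(email_text: str, threshold: float = THRESHOLD) -> bool:
    """If-then rule: if the possibility score reaches the threshold, flag spam."""
    return possibility_score(email_text) >= threshold


if __name__ == "__main__":
    for text in ["Update your baank account", "Meeting at noon tomorrow"]:
        print(text, "->", is_spam(text))
```

Taking the maximum over tokens follows the usual reading of possibility as a supremum of membership degrees; whether the paper aggregates scores this way is an assumption. A fixed threshold also trades false positives against missed misspellings, and would need tuning on real data.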