A Comparative Approach to Naïve Bayes Classifier and Support Vector Machine for Email Spam Classification
Authors: Thae Ma Ma, K. Yamamori, A. Thida
Published in: 2020 IEEE 9th Global Conference on Consumer Electronics (GCCE), 2020-10-13
DOI: 10.1109/GCCE50665.2020.9291921
Citations: 18
Abstract
Spam, or unsolicited email, can cause significant losses for both email users and email servers. An email spam classification system is therefore needed to keep spam out of the mailbox. This paper proposes using two popular machine learning methods, the Naïve Bayes Classifier and the Support Vector Machine, to classify emails as spam or ham based on the content of the message body. In the Naïve Bayes Classifier, individual words are treated as mutually independent features. The Support Vector Machine represents each email as a point in a vector space in which each feature corresponds to one dimension. Finally, the two methods are compared in terms of precision, recall, and F-measure to identify the better-performing method.
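The ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the Naïve Bayes variant, the smoothing, or the tokenization, so the multinomial model with add-one (Laplace) smoothing, the whitespace tokenizer, the toy messages, and all function names below are assumptions made for illustration. The sketch shows words used as independent features, an email mapped to a vector with one dimension per vocabulary word (the kind of input a Support Vector Machine would consume), and the precision, recall, and F-measure computation.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def train_naive_bayes(docs, labels):
    """Fit a multinomial Naive Bayes model with add-one smoothing (assumed variant)."""
    vocab = sorted({w for d in docs for w in tokenize(d)})
    class_docs = Counter(labels)
    word_counts = {c: Counter() for c in class_docs}
    for doc, label in zip(docs, labels):
        word_counts[label].update(tokenize(doc))
    model = {"vocab": vocab, "log_prior": {}, "log_lik": {}}
    for c in class_docs:
        model["log_prior"][c] = math.log(class_docs[c] / len(docs))
        total = sum(word_counts[c].values()) + len(vocab)
        model["log_lik"][c] = {
            w: math.log((word_counts[c][w] + 1) / total) for w in vocab
        }
    return model


def predict(model, text):
    """Return the class with the highest log posterior score."""
    scores = {}
    for c, log_prior in model["log_prior"].items():
        score = log_prior
        for w in tokenize(text):
            # Words are treated as independent, so their log-likelihoods add.
            # Words never seen in training are skipped.
            if w in model["log_lik"][c]:
                score += model["log_lik"][c][w]
        scores[c] = score
    return max(scores, key=scores.get)


def to_vector(vocab, text):
    """Map an email to a count vector: one dimension per vocabulary word."""
    counts = Counter(tokenize(text))
    return [counts[w] for w in vocab]


def precision_recall_f1(y_true, y_pred, positive="spam"):
    """Compute precision, recall, and F-measure for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data, invented for illustration only.
train_docs = [
    "win free money now",
    "free prize claim now",
    "meeting schedule for monday",
    "lunch with the project team",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = train_naive_bayes(train_docs, train_labels)
test_docs = ["claim free money", "project meeting monday"]
test_labels = ["spam", "ham"]
predictions = [predict(model, d) for d in test_docs]
print(predictions)
print(precision_recall_f1(test_labels, predictions))
print(to_vector(model["vocab"], "free free money"))
```

In practice the count vectors produced by `to_vector` would be passed to an SVM trainer from an existing library, and both classifiers would be scored with `precision_recall_f1` on the same held-out emails so that the comparison is like for like.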