{"title":"A Comparative Approach to Naïve Bayes Classifier and Support Vector Machine for Email Spam Classification","authors":"Thae Ma Ma, K. Yamamori, A. Thida","doi":"10.1109/GCCE50665.2020.9291921","journal":{"name":"2020 IEEE 9th Global Conference on Consumer Electronics (GCCE)"},"publicationDate":"2020-10-13","citationCount":"18","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/GCCE50665.2020.9291921"}
A Comparative Approach to Naïve Bayes Classifier and Support Vector Machine for Email Spam Classification
Spam, or unsolicited email sent by spammers, can cause substantial losses for both email users and email servers. An email spam classification system is therefore required to keep spam out of the mailbox. This paper proposes using two popular machine learning methods, the Naïve Bayes classifier and the Support Vector Machine, to classify emails as spam or ham based on their body content. The Naïve Bayes classifier treats individual words as independent features. The Support Vector Machine represents an email as a point in a vector space in which each feature corresponds to one dimension. Finally, the two methods are compared in terms of precision, recall, and F-measure, with the aim of finding the better method.
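
The ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenizer, the add-one (Laplace) smoothing, the toy emails, and all function names (`tokenize`, `train_naive_bayes`, `predict`, `to_vector`, `precision_recall_f1`) are assumptions made for illustration. The sketch shows a Naïve Bayes classifier that treats words as independent features, a bag-of-words vector with one dimension per vocabulary word (the kind of representation a Support Vector Machine would take as input; SVM training itself is not shown), and the three evaluation metrics.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase whitespace tokenization; an assumption, not the paper's preprocessing.
    return text.lower().split()


def train_naive_bayes(docs, labels):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = {c: Counter() for c in class_counts}
    for text, label in zip(docs, labels):
        word_counts[label].update(tokenize(text))
    vocab = sorted({w for counts in word_counts.values() for w in counts})
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior, assuming word independence."""
    class_counts, word_counts, vocab = model
    vocab_set = set(vocab)
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab_set:
                continue  # ignore words never seen during training
            # Add-one smoothing keeps unseen (word, class) pairs from zeroing the product.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def to_vector(text, vocab):
    """Bag-of-words vector: one dimension per vocabulary word."""
    counts = Counter(tokenize(text))
    return [counts[w] for w in vocab]


def precision_recall_f1(y_true, y_pred, positive="spam"):
    """Precision, recall, and F-measure (F1) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy training data (hypothetical, for illustration only).
train_docs = [
    "win free money now",
    "free prize claim now",
    "meeting agenda for monday",
    "lunch on monday with the team",
]
train_labels = ["spam", "spam", "ham", "ham"]
model = train_naive_bayes(train_docs, train_labels)
```

Calling `predict(model, "free money prize")` on this toy model selects the spam class, and `to_vector(text, model[2])` yields the fixed-length count vector that a vector-space classifier would consume. Precision is the fraction of emails flagged as spam that really are spam, recall is the fraction of actual spam that is flagged, and the F-measure is their harmonic mean.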