Ham or spam? A comparative study for some content-based classification algorithms for email filtering
Authors: Salwa Adriana Saab, Nicholas Mitri, M. Awad
Venue: MELECON 2014 - 2014 17th IEEE Mediterranean Electrotechnical Conference
Publication date: 2014-04-13
DOI: 10.1109/MELCON.2014.6820574 (https://doi.org/10.1109/MELCON.2014.6820574)
Citations: 29
Abstract
Spam email has spread to the point that it makes up a significant share of the typical daily inbox. Because spam is a source of financial loss and inconvenience for recipients, it has to be filtered and separated from legitimate mail. This paper surveys some popular filtering algorithms that rely on text classification to decide whether an email is unsolicited. The algorithms are compared on the SpamBase dataset to identify the best classifier in terms of accuracy, computational time, and precision/recall rates.
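The abstract names accuracy and precision/recall as comparison criteria without defining them. As a minimal sketch, assuming binary labels with 1 for spam and 0 for ham, the three measures can be computed from predicted and true labels as shown here. The function name `evaluate` and the toy labels are hypothetical and chosen only for illustration; they are not taken from the paper or from SpamBase.

```python
from typing import Dict, Sequence


def evaluate(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, precision and recall for binary labels (1 = spam, 0 = ham)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # spam flagged as spam
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # ham flagged as spam
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # spam that slipped through
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # ham left alone

    total = tp + fp + fn + tn
    return {
        # Share of all emails classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of emails flagged as spam that really are spam.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of actual spam that the filter caught.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels invented for illustration; NOT results from the paper.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

In spam filtering the two error types are usually not equally costly: a false positive hides a legitimate email, while a false negative only lets one spam message through. This is one reason to report precision and recall alongside accuracy rather than accuracy alone.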