Spam Email Detection on Data Mining: A Review

Elifenesh Yitagesu Desta
Journal: Journal of Information Engineering and Applications
https://doi.org/10.7176/jiea/9-2-01
Citations: 1

Abstract

Email is an effective communication tool: it is one of the fastest ways to send information from one place to another, and it saves both time and cost. However, email is exposed to attacks, including spam. Spam is unwanted email, or bulk mail that floods the internet with many copies of similar messages in an attempt to force them on people who would not otherwise choose to receive them. As spam on the internet has grown, interest in spam filtering has grown accordingly. In this paper we review various spam detection techniques. We apply the techniques both with and without a feature selection algorithm, using all of the classifiers available in each data mining tool. We analyze the classification algorithms using two data mining tools, WEKA and TANAGRA. Data mining is the discovery of knowledge from large databases, that is, the technique of finding new patterns in very large data sets. Both tools provide a range of classification algorithms, such as K-Nearest Neighbor (K-NN) and Naive Bayes (NB). Finally, the best classifier for email spam is identified based on the accuracy of each algorithm in each tool.

Keywords: Classifier, Feature selection, Spam E-mail.

DOI: 10.7176/JIEA/9-2-01
Publication date: April 30, 2019
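The comparison the abstract describes (several classifiers, each run with and without feature selection, ranked by accuracy) can be sketched in a few lines. The following is only a minimal illustration in plain Python, not the WEKA or TANAGRA workflow used in the paper. The toy messages, the document-frequency ranking used for feature selection, the cosine similarity, and the values k = 3 and 8 selected features are all assumptions made for this sketch.

```python
import math
from collections import Counter

# Toy messages written for this sketch; they are NOT data from the paper.
TRAIN = [
    ("win free prize now", "spam"),
    ("free money claim prize", "spam"),
    ("cheap pills free offer", "spam"),
    ("claim your free offer now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project report attached", "ham"),
    ("lunch on monday", "ham"),
    ("please review the project agenda", "ham"),
]
TEST = [
    ("free prize offer", "spam"),
    ("claim cheap pills now", "spam"),
    ("monday project meeting", "ham"),
    ("review the report please", "ham"),
]

def tokenize(text):
    return text.lower().split()

def select_features(data, k):
    """Keep the k words whose document frequency differs most between classes."""
    spam_df, ham_df = Counter(), Counter()
    for text, label in data:
        (spam_df if label == "spam" else ham_df).update(set(tokenize(text)))
    vocab = set(spam_df) | set(ham_df)
    ranked = sorted(vocab, key=lambda w: (-abs(spam_df[w] - ham_df[w]), w))
    return set(ranked[:k])

def to_bag(text, vocab=None):
    """Bag of words, optionally restricted to the selected features."""
    return Counter(w for w in tokenize(text) if vocab is None or w in vocab)

def naive_bayes_predict(train_bags, bag):
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""
    vocab = set().union(*(b for b, _ in train_bags))
    best_label, best_score = None, -math.inf
    for label in sorted({lbl for _, lbl in train_bags}):
        docs = [b for b, lbl in train_bags if lbl == label]
        counts = sum(docs, Counter())
        total = sum(counts.values())
        score = math.log(len(docs) / len(train_bags))  # class prior
        for word, n in bag.items():
            if word in vocab:  # words never seen in training are skipped
                score += n * math.log((counts[word] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_predict(train_bags, bag, k=3):
    """Majority vote among the k training messages most similar to `bag`."""
    nearest = sorted(train_bags, key=lambda item: cosine(item[0], bag), reverse=True)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(predict, train, test, vocab=None):
    train_bags = [(to_bag(text, vocab), label) for text, label in train]
    hits = sum(predict(train_bags, to_bag(text, vocab)) == label for text, label in test)
    return hits / len(test)

def compare_classifiers(k_features=8):
    """Accuracy of each classifier with all features and with selected features."""
    vocab = select_features(TRAIN, k_features)
    results = {}
    for name, predict in [("NB", naive_bayes_predict), ("k-NN", knn_predict)]:
        results[name, "all features"] = accuracy(predict, TRAIN, TEST)
        results[name, "selected features"] = accuracy(predict, TRAIN, TEST, vocab)
    return results

if __name__ == "__main__":
    for (name, setting), acc in compare_classifiers().items():
        print(f"{name:5s} {setting:18s} accuracy = {acc:.2f}")
```

With only a dozen toy messages the printed accuracies carry no real meaning. The sketch is meant to show the shape of the experiment: the same train and test split is passed through each classifier twice, once on the full vocabulary and once on the reduced one, and the resulting accuracies are compared. A realistic study would use a labeled email corpus and cross-validation.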