Proposed efficient algorithm to filter spam using machine learning techniques

Pacific Science Review A: Natural Science and Engineering. Pub Date: 2016-07-01. DOI: 10.1016/j.psra.2016.09.017

Ali Shafigh Aski, Navid Khalilzadeh Sourati

Volume 18, Issue 2, Pages 145-149. Full text: https://www.sciencedirect.com/science/article/pii/S2405882316300412

Citations: 53

Abstract

Electronic spam is the most troublesome Internet phenomenon challenging large global companies, including AOL, Google, Yahoo and Microsoft. Spam causes various problems that may, in turn, cause economic losses. Spam causes traffic problems and bottlenecks that limit memory space, computing power and speed. Spam causes users to spend time removing it. Various methods have been developed to filter spam, including black list/white list, Bayesian classification algorithms, keyword matching, header information processing, investigation of spam-sending factors and investigation of received mails. This study describes three machine-learning algorithms to filter spam from valid emails with low error rates and high efficiency using a multilayer perceptron model. Several widely used techniques include C4.5 decision tree classifier, multilayer perceptron and Naïve Bayes classifier, all of which are used for training data whether in the form of spam or valid emails. Finally, the results are discussed, and outputs of considered techniques are examined in relation to the proposed model.
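Of the three classifiers the abstract names, Naïve Bayes is the simplest to illustrate. The following is a minimal sketch in plain Python, assuming a whitespace tokenizer, Laplace (add-one) smoothing, and a four-message toy corpus invented here purely for illustration. The function names (`tokenize`, `train_naive_bayes`, `predict`) and the data are hypothetical and are not taken from the paper, whose dataset, features, and proposed multilayer perceptron model are not described in the abstract.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def train_naive_bayes(messages, labels):
    """Collect per-class word counts, per-class message counts, and the vocabulary."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    for text, label in zip(messages, labels):
        word_counts[label].update(tokenize(text))
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    return word_counts, class_counts, vocab


def predict(text, model):
    """Return the label with the highest log-posterior score for the message."""
    word_counts, class_counts, vocab = model
    total_messages = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # Log prior: fraction of training messages carrying this label.
        score = math.log(class_counts[label] / total_messages)
        # Denominator for add-one smoothing over the whole vocabulary.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


# Toy training data, invented for illustration only.
TRAIN_MESSAGES = [
    "win free money now",
    "free prize claim now",
    "meeting agenda for monday",
    "lunch with the project team",
]
TRAIN_LABELS = ["spam", "spam", "ham", "ham"]

model = train_naive_bayes(TRAIN_MESSAGES, TRAIN_LABELS)
print(predict("claim your free money", model))
print(predict("agenda for the team meeting", model))
```

A real evaluation along the lines the abstract describes would train each classifier on a labeled corpus of spam and valid emails and compare error rates on held-out messages; this sketch covers only the classification step for one of the three techniques.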
