Analysis of Machine Learning Algorithms by Developing a Phishing Email and Website Detection Model

2021 IEEE International Conference on Computation System and Information Technology for Sustainable Solutions (CSITSS). Pub Date: 2021-12-16. DOI: 10.1109/CSITSS54238.2021.9683131

Nimisha Dey, S. Samhitha, Malavika Hariprasad, Anagha Anand, Veena Gadad


Citations: 1

Abstract

Machine Learning is a key branch of Artificial Intelligence that concentrates on developing computational algorithms by creating models. It has attracted major attention in the technology domain because of its applications in speech recognition, recommendation engines, computer vision, automated stock trading, and other areas. A model's performance depends on the dataset provided, and its accuracy can often be improved by expanding the training dataset. Since the onset of Covid-19, phishing websites and, in particular, phishing attacks have risen sharply. Cybercriminals carry out these attacks by sending PDFs, Microsoft Office documents, and other attachments via email. This paper discusses and compares different machine learning algorithms capable of detecting phishing emails and websites. The experiments show that MultinomialNB attains the highest efficiency, 98.06%, for phishing email detection, and that the Decision Tree Classifier offers the maximum efficiency, 95.41%, for phishing website detection.
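
The abstract names MultinomialNB as the best-performing email classifier but does not describe the features or data used. As a rough illustration of the underlying technique only, the sketch below implements multinomial Naive Bayes from scratch, assuming bag-of-words token counts and Laplace smoothing. The function names `train_multinomial_nb` and `predict`, the smoothing value, and the four toy messages are all hypothetical; this is not the authors' pipeline and will not reproduce their reported figures.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit multinomial Naive Bayes on tokenized documents (illustrative only)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)  # per-class token frequencies
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)

    model = {"log_prior": {}, "log_lik": {}}
    for c, n in class_counts.items():
        # Class prior: fraction of training documents with label c.
        model["log_prior"][c] = math.log(n / len(labels))
        # Laplace-smoothed token likelihoods (alpha is an assumed setting).
        denom = sum(word_counts[c].values()) + alpha * len(vocab)
        model["log_lik"][c] = {
            w: math.log((word_counts[c][w] + alpha) / denom) for w in vocab
        }
    return model


def predict(model, tokens):
    """Return the class with the highest log-posterior score."""
    scores = {}
    for c, prior in model["log_prior"].items():
        lik = model["log_lik"][c]
        # Tokens never seen during training are ignored.
        scores[c] = prior + sum(lik[w] for w in tokens if w in lik)
    return max(scores, key=scores.get)


# Toy, made-up training data; a real study would use a labeled email corpus.
train_docs = [
    "verify your account password now".split(),
    "urgent click link to claim prize".split(),
    "meeting agenda for monday attached".split(),
    "lunch plans and project update".split(),
]
train_labels = ["phishing", "phishing", "legitimate", "legitimate"]

model = train_multinomial_nb(train_docs, train_labels)
print(predict(model, "click to verify your password".split()))
print(predict(model, "project meeting on monday".split()))
```

Working in log space avoids numeric underflow when many token probabilities are multiplied, and the smoothing term keeps a token that is absent from one class from forcing that class's score to negative infinity.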


