Detecting Phishing Emails Using Random Forest and AdaBoost Classifier Model

Fredrick Nthurima, Abraham Mutua, Waithaka Stephen Titus
Open Journal for Information Technology. Journal article, published 2023-11-14. DOI: 10.32591/coas.ojit.0602.03123n (https://doi.org/10.32591/coas.ojit.0602.03123n)

Abstract

A phishing attack occurs when a phishing email, a legitimate-looking message designed to convince the recipient that it is genuine, lures the recipient into opening it and clicking malicious links embedded in it. The user is then led to reveal sensitive information such as credit card numbers, usernames, or passwords to the attacker, who thereby gains entry to the compromised account. Online surveys rank phishing as the leading attack on web content, mostly targeting financial institutions. According to a survey conducted by Ponemon Institute LLC (2017), losses due to phishing attacks are about $1.5 billion per year. Phishing is a global threat to information security, it is on the rise due to the Internet of Things (IoT), and it therefore requires a better detection mechanism to mitigate these losses and the associated reputational damage. This research paper explores and reports the use of a combination of two machine learning algorithms, Random Forest and AdaBoost, together with a larger set of phishing email features, to improve the accuracy of phishing detection and prevention. The project will explore existing phishing methods, investigate the effect of combining two machine learning algorithms to detect and prevent phishing attacks, design and develop a supervised classifier that can detect and prevent phishing emails, and test the model with existing data. A dataset consisting of both benign and phishing emails will be used to train the model in a supervised manner. The expected accuracy is 99.9%, with False Negative (FN) and False Positive (FP) rates of 0.1% or below.
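
The accuracy, FP, and FN targets stated in the abstract can be checked against a classifier's predictions with a few lines of code. The following is a minimal sketch, not the authors' evaluation procedure. It assumes that label 1 means "phishing" and 0 means "benign", that the FP rate is FP / (FP + TN), and that the FN rate is FN / (FN + TP); the abstract does not state which denominators it uses. The names `evaluate`, `meets_targets`, and `DetectionMetrics` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DetectionMetrics:
    accuracy: float
    false_positive_rate: float
    false_negative_rate: float


def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, FP rate, and FN rate for a binary phishing detector.

    Assumption: labels equal to `positive` mean "phishing"; any other
    label is treated as "benign".
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # phishing email correctly flagged
            else:
                fn += 1  # phishing email missed
        else:
            if pred == positive:
                fp += 1  # benign email wrongly flagged
            else:
                tn += 1  # benign email correctly passed

    accuracy = (tp + tn) / len(y_true)
    # Guard against a class being absent from the evaluation set.
    fp_rate = fp / (fp + tn) if (fp + tn) else 0.0
    fn_rate = fn / (fn + tp) if (fn + tp) else 0.0
    return DetectionMetrics(accuracy, fp_rate, fn_rate)


def meets_targets(metrics, min_accuracy=0.999, max_error_rate=0.001):
    """Check the abstract's stated targets: 99.9% accuracy, FP and FN at most 0.1%."""
    return (
        metrics.accuracy >= min_accuracy
        and metrics.false_positive_rate <= max_error_rate
        and metrics.false_negative_rate <= max_error_rate
    )


if __name__ == "__main__":
    # Hypothetical labels for illustration only, not results from the paper.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    metrics = evaluate(y_true, y_pred)
    print(metrics, meets_targets(metrics))
```

Reporting the FP and FN rates separately, rather than accuracy alone, matters here because a benign email wrongly blocked and a phishing email wrongly delivered carry different costs, and a dataset dominated by benign mail can show high accuracy even when many phishing emails are missed.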