基于深度强化学习的不平衡网络钓鱼分类

Comput. Pub Date : 2023-06-09 DOI:10.3390/computers12060118
Antonio Maci, Alessandro Santorsola, Anthony J. Coscia, Andrea Iannacone
{"title":"基于深度强化学习的不平衡网络钓鱼分类","authors":"Antonio Maci, Alessandro Santorsola, Anthony J. Coscia, Andrea Iannacone","doi":"10.3390/computers12060118","DOIUrl":null,"url":null,"abstract":"Web phishing is a form of cybercrime aimed at tricking people into visiting malicious URLs to exfiltrate sensitive data. Since the structure of a malicious URL evolves over time, phishing detection mechanisms that can adapt to such variations are paramount. Furthermore, web phishing detection is an unbalanced classification task, as legitimate URLs outnumber malicious ones in real-life cases. Deep learning (DL) has emerged as a promising technique to minimize concept drift to enhance web phishing detection. Deep reinforcement learning (DRL) combines DL with reinforcement learning (RL); that is, a sequential decision-making paradigm in which the problem to be addressed is expressed as a Markov decision process (MDP). Recent studies have proposed an ad hoc MDP formulation to tackle unbalanced classification tasks called the imbalanced classification Markov decision process (ICMDP). In this paper, we exploit the ICMDP to present a double deep Q-Network (DDQN)-based classifier to address the unbalanced web phishing classification problem. The proposed algorithm is evaluated on a Mendeley web phishing dataset, from which three different data imbalance scenarios are generated. Despite a significant training time, it results in better geometric mean, index of balanced accuracy, F1 score, and area under the ROC curve than other DL-based classifiers combined with data-level sampling techniques in all test cases.","PeriodicalId":10526,"journal":{"name":"Comput.","volume":"99 1","pages":"118"},"PeriodicalIF":0.0000,"publicationDate":"2023-06-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Unbalanced Web Phishing Classification through Deep Reinforcement Learning\",\"authors\":\"Antonio Maci, Alessandro Santorsola, Anthony J. Coscia, Andrea Iannacone\",\"doi\":\"10.3390/computers12060118\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Web phishing is a form of cybercrime aimed at tricking people into visiting malicious URLs to exfiltrate sensitive data. Since the structure of a malicious URL evolves over time, phishing detection mechanisms that can adapt to such variations are paramount. Furthermore, web phishing detection is an unbalanced classification task, as legitimate URLs outnumber malicious ones in real-life cases. Deep learning (DL) has emerged as a promising technique to minimize concept drift to enhance web phishing detection. Deep reinforcement learning (DRL) combines DL with reinforcement learning (RL); that is, a sequential decision-making paradigm in which the problem to be addressed is expressed as a Markov decision process (MDP). Recent studies have proposed an ad hoc MDP formulation to tackle unbalanced classification tasks called the imbalanced classification Markov decision process (ICMDP). In this paper, we exploit the ICMDP to present a double deep Q-Network (DDQN)-based classifier to address the unbalanced web phishing classification problem. The proposed algorithm is evaluated on a Mendeley web phishing dataset, from which three different data imbalance scenarios are generated. Despite a significant training time, it results in better geometric mean, index of balanced accuracy, F1 score, and area under the ROC curve than other DL-based classifiers combined with data-level sampling techniques in all test cases.\",\"PeriodicalId\":10526,\"journal\":{\"name\":\"Comput.\",\"volume\":\"99 1\",\"pages\":\"118\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-06-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Comput.\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3390/computers12060118\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Comput.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/computers12060118","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

网络钓鱼是一种旨在诱骗人们访问恶意url以窃取敏感数据的网络犯罪形式。由于恶意URL的结构会随着时间的推移而演变,因此能够适应这种变化的网络钓鱼检测机制至关重要。此外,网络钓鱼检测是一项不平衡的分类任务,因为在现实生活中,合法的url多于恶意的url。深度学习(DL)已成为一种有前途的技术,以减少概念漂移,提高网络钓鱼检测。深度强化学习(DRL)将深度学习与强化学习(RL)相结合;也就是说,一种顺序决策范式,其中要解决的问题被表示为马尔可夫决策过程(MDP)。最近的研究提出了一种特殊的MDP公式来解决不平衡分类任务,称为不平衡分类马尔可夫决策过程(ICMDP)。在本文中,我们利用ICMDP提出了一个基于双深度Q-Network (DDQN)的分类器来解决不平衡的网络钓鱼分类问题。在Mendeley网络钓鱼数据集上对该算法进行了评估,并从中生成了三种不同的数据不平衡场景。尽管训练时间很长,但在所有测试用例中,它比其他基于dl的分类器结合数据级采样技术在几何均值、平衡精度指数、F1分数和ROC曲线下面积方面都有更好的表现。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Unbalanced Web Phishing Classification through Deep Reinforcement Learning
Web phishing is a form of cybercrime aimed at tricking people into visiting malicious URLs to exfiltrate sensitive data. Since the structure of a malicious URL evolves over time, phishing detection mechanisms that can adapt to such variations are paramount. Furthermore, web phishing detection is an unbalanced classification task, as legitimate URLs outnumber malicious ones in real-life cases. Deep learning (DL) has emerged as a promising technique to minimize concept drift to enhance web phishing detection. Deep reinforcement learning (DRL) combines DL with reinforcement learning (RL); that is, a sequential decision-making paradigm in which the problem to be addressed is expressed as a Markov decision process (MDP). Recent studies have proposed an ad hoc MDP formulation to tackle unbalanced classification tasks called the imbalanced classification Markov decision process (ICMDP). In this paper, we exploit the ICMDP to present a double deep Q-Network (DDQN)-based classifier to address the unbalanced web phishing classification problem. The proposed algorithm is evaluated on a Mendeley web phishing dataset, from which three different data imbalance scenarios are generated. Despite a significant training time, it results in better geometric mean, index of balanced accuracy, F1 score, and area under the ROC curve than other DL-based classifiers combined with data-level sampling techniques in all test cases.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信