PhishTrim: Fast and adaptive phishing detection based on deep representation learning

Lei Zhang, Peng Zhang
Published in: 2020 IEEE International Conference on Web Services (ICWS), 2020-10-01
DOI: 10.1109/ICWS49710.2020.00030 (https://doi.org/10.1109/ICWS49710.2020.00030)
Citation count: 7

Abstract

Phishing is a network attack known for stealing users' private information without their knowledge. Although researchers have proposed many phishing detection methods, most are computationally expensive, and their detection rules are difficult to update as attack patterns change. In this paper, we propose PhishTrim, a lightweight, fast, and adaptive phishing URL detection method based on deep representation learning. We obtain the initial embedding representation of the URLs from a pre-trained Skip-gram model. A Bidirectional Long Short-Term Memory network (Bi-LSTM) then extracts context dependencies to learn a deeper representation of the URLs, and a Convolutional Neural Network (CNN) extracts local n-gram features. Experiments show that PhishTrim achieves 99.797% accuracy on large-scale datasets and indicate that the method has some ability to detect zero-day phishing attacks. We have published our PhishTrim2019 dataset at https://github.com/DataReleased/PhishTrim.
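The abstract names two ingredients that are easy to picture on a toy URL: the (center, context) pairs a Skip-gram model trains on, and the local n-gram windows that a convolution slides over. The sketch below is only an illustration under stated assumptions, not the authors' implementation: character-level tokenization, the window sizes, and the helper names `tokenize_url`, `skipgram_pairs`, and `ngram_windows` are all hypothetical, since the abstract does not specify them.

```python
from typing import List, Tuple


def tokenize_url(url: str) -> List[str]:
    """Split a URL into character tokens.

    Assumption: the abstract does not state the token unit; characters
    are used here purely for illustration.
    """
    return list(url)


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (center, context) pairs for tokens within `window` positions of each center."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a token is not its own context
                pairs.append((center, tokens[j]))
    return pairs


def ngram_windows(tokens: List[str], n: int = 3) -> List[str]:
    """Return the contiguous n-token windows a width-n 1-D convolution would cover."""
    return ["".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


if __name__ == "__main__":
    tokens = tokenize_url("a.co/x")  # hypothetical toy URL
    print(skipgram_pairs(tokens, window=1))
    print(ngram_windows(tokens, n=3))
```

In a full pipeline, pairs like these would train the embedding table, the embedded token sequence would pass through the Bi-LSTM, and convolution filters would then operate over windows like those returned by `ngram_windows`.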