PhishTrim: Fast and adaptive phishing detection based on deep representation learning

2020 IEEE International Conference on Web Services (ICWS). Pub Date: 2020-10-01. DOI: 10.1109/ICWS49710.2020.00030

Lei Zhang, Peng Zhang

Citations: 7

Abstract

Phishing is a network attack that steals users' private information without their knowledge. Although researchers have proposed many phishing detection methods, most are computationally expensive and cannot easily update their detection rules as attack patterns change. In this paper, we propose PhishTrim, a lightweight, fast, and adaptive phishing URL detection method based on deep representation learning. We obtain an initial embedding representation of each URL from a pre-trained Skip-gram model. A bidirectional Long Short-Term Memory network (Bi-LSTM) then extracts context dependencies to learn a deeper representation of the URL, and a Convolutional Neural Network (CNN) extracts local n-gram features. Experiments show that PhishTrim performs better on large-scale datasets, reaching 99.797% accuracy, and indicate that the method has some ability to detect zero-day phishing attacks. We have published our PhishTrim2019 dataset at https://github.com/DataReleased/PhishTrim.
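
To make the Skip-gram pre-training step more concrete, the following is a minimal sketch of how (center, context) training pairs could be generated from a URL. It is not the authors' implementation: the abstract does not state the token unit or the window size, so character-level tokens, lowercasing, and the names `tokenize_url`, `skipgram_pairs`, and `window` are all assumptions made for illustration.

```python
from typing import List, Tuple


def tokenize_url(url: str) -> List[str]:
    """Split a URL into lowercase character tokens.

    Assumption: character-level tokens. The abstract does not say
    whether PhishTrim tokenizes by character, word, or something else.
    """
    return list(url.lower())


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (center, context) pairs for Skip-gram training.

    Each token is paired with every other token that lies at most
    `window` positions to its left or right.
    """
    pairs: List[Tuple[str, str]] = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)                # left edge of the context window
        hi = min(len(tokens), i + window + 1)  # right edge (exclusive)
        for j in range(lo, hi):
            if j != i:                         # a token is not its own context
                pairs.append((center, tokens[j]))
    return pairs


if __name__ == "__main__":
    # Hypothetical URL, used only to show the shape of the output.
    example = "http://a.io"
    for center, context in skipgram_pairs(tokenize_url(example), window=1)[:6]:
        print(center, "->", context)
```

A Skip-gram model trained on such pairs learns one embedding vector per token; in the pipeline the abstract describes, those vectors would be the input that the Bi-LSTM and CNN layers refine.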
