PhishTrim: Fast and adaptive phishing detection based on deep representation learning
Lei Zhang, Peng Zhang
2020 IEEE International Conference on Web Services (ICWS), 2020-10-01
DOI: 10.1109/ICWS49710.2020.00030 (https://doi.org/10.1109/ICWS49710.2020.00030)
Citations: 7
Abstract
Phishing is a network attack that steals users' private information without their knowledge. Although researchers have proposed many phishing detection methods, most are computationally expensive, and their detection rules are difficult to update as attack patterns change. In this paper, we propose PhishTrim, a lightweight, fast, and adaptive method for detecting phishing URLs based on deep representation learning. We obtain the initial embedding representation of the URLs with a pre-trained Skip-gram model. A Bidirectional Long Short-Term Memory (Bi-LSTM) network then extracts context dependencies to learn a deeper representation of the URLs, and a Convolutional Neural Network (CNN) extracts local n-gram features. Experiments show that PhishTrim performs better on large-scale datasets, reaching 99.797% accuracy, and indicate that the method has some ability to detect zero-day phishing attacks. We have published our PhishTrim2019 dataset at https://github.com/DataReleased/PhishTrim.
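As a rough illustration of the two input views named in the abstract, the sketch below generates Skip-gram (center, context) training pairs from a URL and lists the overlapping n-grams that a width-n convolution would slide over. It is not the authors' implementation: the character-level tokenization, the window size, the n-gram width, and the function names `tokenize_url`, `skipgram_pairs`, and `char_ngrams` are all assumptions made for this example, since the abstract does not specify them.

```python
from typing import List, Tuple


def tokenize_url(url: str) -> List[str]:
    """Split a URL into character tokens.

    Assumption: the abstract does not state the token unit, so
    characters are used here purely for illustration.
    """
    return list(url)


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (center, context) pairs for Skip-gram training.

    Each token is paired with every other token that lies within
    `window` positions of it (window size is a hypothetical choice).
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def char_ngrams(url: str, n: int = 3) -> List[str]:
    """Return the overlapping character n-grams of a URL.

    These are the local windows a 1-D convolution of width n would
    cover; a URL shorter than n yields an empty list.
    """
    return [url[i:i + n] for i in range(len(url) - n + 1)]


if __name__ == "__main__":
    tokens = tokenize_url("a.co")
    print(skipgram_pairs(tokens, window=1))
    print(char_ngrams("a.co", n=3))
```

In a full pipeline, pairs like these would train the embedding table, and the embedded sequence would then feed the Bi-LSTM and CNN stages; those stages are omitted here because their hyperparameters are not given in the abstract.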