CrawlPhish: Large-scale Analysis of Client-side Cloaking Techniques in Phishing

2021 IEEE Symposium on Security and Privacy (SP) | Pub Date: 2021-05-01 | DOI: 10.1109/SP40001.2021.00021

Penghui Zhang, Adam Oest, Haehyun Cho, Zhibo Sun, RC Johnson, Brad Wardman, Shaown Sarker, A. Kapravelos, Tiffany Bao, Ruoyu Wang, Yan Shoshitaishvili, Adam Doupé, Gail-Joon Ahn

Pages: 1109-1124

Citations: 49

Abstract

Phishing is a critical threat to Internet users. Although an extensive ecosystem serves to protect users, phishing websites are growing in sophistication, and they can slip past the ecosystem’s detection systems (and subsequently cause real-world damage) with the help of evasion techniques. Sophisticated client-side evasion techniques, known as cloaking, leverage JavaScript to enable complex interactions between potential victims and the phishing website, and can thus be particularly effective in slowing or entirely preventing automated mitigations. Yet, neither the prevalence nor the impact of client-side cloaking has been studied.

In this paper, we present CrawlPhish, a framework for automatically detecting and categorizing client-side cloaking used by known phishing websites. We deploy CrawlPhish over 14 months between 2018 and 2019 to collect and thoroughly analyze a dataset of 112,005 phishing websites in the wild. By adapting state-of-the-art static and dynamic code analysis, we find that 35,067 of these websites have 1,128 distinct implementations of client-side cloaking techniques. Moreover, we find that attackers’ use of cloaking grew from 23.32% initially to 33.70% by the end of our data collection period. Detection of cloaking by our framework exhibited low false-positive and false-negative rates of 1.45% and 1.75%, respectively. We analyze the semantics of the techniques we detected and propose a taxonomy of eight types of evasion across three high-level categories: User Interaction, Fingerprinting, and Bot Behavior.

Using 150 artificial phishing websites, we empirically show that each category of evasion technique is effective in avoiding browser-based phishing detection (a key ecosystem defense). Additionally, through a user study, we verify that the techniques generally do not discourage victim visits. Therefore, we propose ways in which our methodology can be used to not only improve the ecosystem’s ability to mitigate phishing websites with client-side cloaking, but also continuously identify emerging cloaking techniques as they are launched by attackers.
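The headline figures in the abstract can be related to one another with a little arithmetic. The sketch below is only an illustration based on the numbers quoted above; the constant and function names are hypothetical, and the derived quantities (overall share, average sites per implementation, relative growth) are not stated by the authors.

```python
# Figures quoted in the abstract; all names here are illustrative assumptions.
TOTAL_SITES = 112_005             # phishing websites collected
CLOAKED_SITES = 35_067            # websites using client-side cloaking
DISTINCT_IMPLEMENTATIONS = 1_128  # distinct cloaking implementations
CLOAKING_START_PCT = 23.32        # share of cloaking at the start of collection
CLOAKING_END_PCT = 33.70          # share of cloaking at the end of collection


def percent(part: float, total: float) -> float:
    """Return part as a percentage of total."""
    return 100.0 * part / total


# Share of the whole dataset that used client-side cloaking.
overall_share_pct = percent(CLOAKED_SITES, TOTAL_SITES)

# Average number of cloaked websites per distinct implementation
# (a derived mean, not a figure reported in the abstract).
sites_per_implementation = CLOAKED_SITES / DISTINCT_IMPLEMENTATIONS

# Growth over the collection period, in percentage points and relative terms.
growth_points = CLOAKING_END_PCT - CLOAKING_START_PCT
relative_growth_pct = percent(growth_points, CLOAKING_START_PCT)

print(f"Overall cloaking share: {overall_share_pct:.2f}%")
print(f"Average sites per implementation: {sites_per_implementation:.1f}")
print(f"Growth: {growth_points:.2f} points ({relative_growth_pct:.1f}% relative)")
```

Under these figures, roughly 31% of the collected websites used cloaking overall, which sits between the reported starting and ending shares, as one would expect for a rate that rose during the collection period.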

