Needle in a Haystack: Tracking Down Elite Phishing Domains in the Wild

K. Tian, Steve T. K. Jan, Hang Hu, D. Yao, G. Wang
{"title":"Needle in a Haystack: Tracking Down Elite Phishing Domains in the Wild","authors":"K. Tian, Steve T. K. Jan, Hang Hu, D. Yao, G. Wang","doi":"10.1145/3278532.3278569","DOIUrl":null,"url":null,"abstract":"Today's phishing websites are constantly evolving to deceive users and evade the detection. In this paper, we perform a measurement study on squatting phishing domains where the websites impersonate trusted entities not only at the page content level but also at the web domain level. To search for squatting phishing pages, we scanned five types of squatting domains over 224 million DNS records and identified 657K domains that are likely impersonating 702 popular brands. Then we build a novel machine learning classifier to detect phishing pages from both the web and mobile pages under the squatting domains. A key novelty is that our classifier is built on a careful measurement of evasive behaviors of phishing pages in practice. We introduce new features from visual analysis and optical character recognition (OCR) to overcome the heavy content obfuscation from attackers. In total, we discovered and verified 1,175 squatting phishing pages. We show that these phishing pages are used for various targeted scams, and are highly effective to evade detection. More than 90% of them successfully evaded popular blacklists for at least a month.","PeriodicalId":20640,"journal":{"name":"Proceedings of the Internet Measurement Conference 2018","volume":"18 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2018-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"104","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Internet Measurement Conference 2018","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3278532.3278569","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 104

Abstract

Today's phishing websites are constantly evolving to deceive users and evade the detection. In this paper, we perform a measurement study on squatting phishing domains where the websites impersonate trusted entities not only at the page content level but also at the web domain level. To search for squatting phishing pages, we scanned five types of squatting domains over 224 million DNS records and identified 657K domains that are likely impersonating 702 popular brands. Then we build a novel machine learning classifier to detect phishing pages from both the web and mobile pages under the squatting domains. A key novelty is that our classifier is built on a careful measurement of evasive behaviors of phishing pages in practice. We introduce new features from visual analysis and optical character recognition (OCR) to overcome the heavy content obfuscation from attackers. In total, we discovered and verified 1,175 squatting phishing pages. We show that these phishing pages are used for various targeted scams, and are highly effective to evade detection. More than 90% of them successfully evaded popular blacklists for at least a month.
大海捞针:在野外追踪精英网络钓鱼域名
当今的网络钓鱼网站不断发展,欺骗用户,逃避检测。在本文中,我们对蹲式钓鱼域名进行了测量研究,其中网站不仅在页面内容级别而且在web域名级别冒充可信实体。为了搜索抢注网络钓鱼页面,我们扫描了五种类型的抢注域名,超过2.24亿个DNS记录,并确定了657K个可能冒充702个流行品牌的域名。然后,我们构建了一种新的机器学习分类器来检测来自网页和移动页面的钓鱼页面。一个关键的新颖之处在于,我们的分类器是建立在对网络钓鱼页面规避行为的仔细测量之上的。我们引入了视觉分析和光学字符识别(OCR)的新特性来克服攻击者对内容的严重混淆。我们总共发现并验证了1175个钓鱼页面。我们表明,这些网络钓鱼页面用于各种有针对性的诈骗,并且非常有效地逃避检测。超过90%的人成功地躲过了流行黑名单至少一个月。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信