On Effectiveness of Source Code and SSL Based Features for Phishing Website Detection

Roopak S, Athira P. Vijayaraghavan, Tony Thomas
Published in: 2019 1st International Conference on Advanced Technologies in Intelligent Control, Environment, Computing & Communication Engineering (ICATIECE)
Publication date: 2019-03-01
DOI: 10.1109/ICATIECE45860.2019.9063824 (https://doi.org/10.1109/ICATIECE45860.2019.9063824)
Citations: 8

Abstract

Phishing is a social engineering attack that steals user credentials through data entry forms on malicious websites. Currently available anti-malware software can detect only blacklisted phishing websites. Similarity-based detection methods, such as visual similarity, are easily evaded by making small changes to the textual and visual content of a phishing site. The phishing behavior of a web page can be identified from its URL, domain, and source-code-based features. However, URL- and domain-based features can be easily defeated using black hat SEO techniques. In this paper, we extract relevant rules based on webpage source code and Secure Sockets Layer (SSL) features from a training dataset using the Repeated Incremental Pruning to Produce Error Reduction (RIPPER) algorithm. We then check for the presence of these rules in a test dataset. Our implementation results show that the webpage source-code-based rules can identify phishing websites with an accuracy of 0.92.
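
The pipeline described in the abstract (extract features from page source and SSL information, then apply an ordered rule list) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not list the features or the learned rules, so the feature names (`password_field`, `external_form_action`, `empty_link_ratio`, `valid_certificate`), the thresholds, and the rules in `RULES` are hypothetical. In the paper the rules are learned by RIPPER from a training dataset; here they are written by hand only to show the shape of a first-match rule list with a default class.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class SourceFeatureParser(HTMLParser):
    """Collects a few illustrative (hypothetical) source-code features."""

    def __init__(self, page_host):
        super().__init__()
        self.page_host = page_host
        self.has_password_field = False
        self.external_form_action = False
        self.total_links = 0
        self.empty_links = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # attribute values may be None, hence "or ''"
        if tag == "input" and (attrs.get("type") or "").lower() == "password":
            self.has_password_field = True
        elif tag == "form":
            # A form that posts to a different host than the page itself.
            host = urlparse(attrs.get("action") or "").netloc
            if host and host != self.page_host:
                self.external_form_action = True
        elif tag == "a":
            self.total_links += 1
            href = (attrs.get("href") or "").strip()
            if href in ("", "#") or href.lower().startswith("javascript:"):
                self.empty_links += 1


def extract_features(html, page_url, has_valid_certificate):
    """Builds a feature dict; the SSL flag is supplied by the caller."""
    parser = SourceFeatureParser(urlparse(page_url).netloc)
    parser.feed(html)
    parser.close()
    ratio = parser.empty_links / parser.total_links if parser.total_links else 0.0
    return {
        "password_field": parser.has_password_field,
        "external_form_action": parser.external_form_action,
        "empty_link_ratio": ratio,
        "valid_certificate": has_valid_certificate,
    }


# Hand-written stand-ins for learned rules (assumption, not from the paper).
RULES = [
    lambda f: f["password_field"] and f["external_form_action"],
    lambda f: f["password_field"] and not f["valid_certificate"],
    lambda f: f["password_field"] and f["empty_link_ratio"] > 0.5,
]


def classify(features):
    """Any matching rule yields 'phishing'; the default class is 'legitimate'."""
    return "phishing" if any(rule(features) for rule in RULES) else "legitimate"


def accuracy(samples):
    """Fraction of (features, label) pairs that the rule list classifies correctly."""
    return sum(classify(f) == label for f, label in samples) / len(samples)


if __name__ == "__main__":
    page = ('<form action="https://collector.example.net/save">'
            '<input type="password"></form><a href="#">Help</a>')
    feats = extract_features(page, "https://login.example.com/", True)
    print(classify(feats))
```

Keeping the rules as an ordered list of predicates with a default class mirrors the general form of rule sets that rule learners such as RIPPER produce, which makes each decision easy to inspect. The accuracy helper corresponds to checking the rules against a labeled test dataset, as the abstract describes.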