Use of HOG descriptors in phishing detection
A. S. Bozkir, E. Sezer
2016 4th International Symposium on Digital Forensic and Security (ISDFS)
Published: 2016-04-25
DOI: 10.1109/ISDFS.2016.7473534 (https://doi.org/10.1109/ISDFS.2016.7473534)
Citations: 19
Abstract
Phishing is a scam that deceives computer users visually: fake web pages mimic their legitimate targets in order to steal valuable digital data such as credit card information or e-mail passwords. In contrast to other anti-phishing approaches, this paper addresses the problem with a purely computer-vision-based method built on web page layout similarity. The proposed approach employs the Histogram of Oriented Gradients (HOG) descriptor to capture page layout cues without a time-consuming intermediate segmentation stage, and uses the histogram intersection kernel as the similarity metric. The result is an efficient and fast phishing page detection scheme intended to combat zero-day phishing page attacks. To verify the efficiency of the detection mechanism, 50 unique phishing pages and their legitimate targets were collected, along with 100 pairs of legitimate pages. Similarity scores were then computed for both groups and compared. The results are promising and suggest that a similarity of around 75% or above can be adequate for raising an alarm.
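The pipeline described in the abstract (HOG descriptor, histogram intersection, alarm threshold) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the 8-pixel cell size, the 9 unsigned orientation bins, and the L1 normalization of the whole descriptor are all assumptions, since the abstract does not state the parameters used. Block normalization, which full HOG implementations apply, is omitted for brevity. Both page screenshots are assumed to be grayscale and already resized to the same dimensions.

```python
import math


def hog_descriptor(image, cell=8, bins=9):
    """Simplified HOG: per-cell histograms of unsigned gradient orientation.

    `image` is a list of rows of grayscale values. The cell size, bin count,
    and global L1 normalization are illustrative assumptions.
    """
    height, width = len(image), len(image[0])
    cells_y, cells_x = height // cell, width // cell
    hist = [0.0] * (cells_y * cells_x * bins)
    for y in range(cells_y * cell):
        for x in range(cells_x * cell):
            # Central differences, clamped at the image border.
            gx = image[y][min(x + 1, width - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, height - 1)][x] - image[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            # Unsigned orientation in [0, 180) degrees, mapped to a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            b = min(int(angle / (180.0 / bins)), bins - 1)
            cell_index = (y // cell) * cells_x + (x // cell)
            # Each pixel votes with its gradient magnitude.
            hist[cell_index * bins + b] += magnitude
    total = sum(hist)
    # L1-normalize so that histogram intersection falls in [0, 1].
    return [h / total for h in hist] if total > 0 else hist


def histogram_intersection(a, b):
    """Histogram intersection kernel: sum of element-wise minima."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(min(p, q) for p, q in zip(a, b))


def is_alarming(similarity, threshold=0.75):
    """Flag a page pair whose layout similarity reaches the threshold.

    The 0.75 default mirrors the 'around 75%' figure in the abstract.
    """
    return similarity >= threshold
```

With L1-normalized descriptors, comparing a page with itself gives a similarity of 1, and pages with different layouts score lower. A suspicious page would be compared against the screenshot of the legitimate page it might be imitating, and a score at or above the threshold would raise an alarm.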