Phishing Site Detection Using Similarity of Website Structure
Authors: Shoma Tanaka, T. Matsunaka, A. Yamada, A. Kubota
Published in: 2021 IEEE Conference on Dependable and Secure Computing (DSC), 2021-01-30
DOI: https://doi.org/10.1109/DSC49826.2021.9346256
Citations: 5
Abstract
Phishing sites are growing in number and have become a serious problem. Typical phishing sites are very short-lived. Phishers are thought to build them with tools such as phishing kits, and sites built with the same tool have similar website structures. We propose a new detection method based on the similarity of website structure information, defined by the types and sizes of the web resources that make up a website. Our method can detect phishing sites that are not registered in blocklists or whose URL strings do not resemble those of the legitimate sites they target. It can also identify phishing sites that differ in appearance but share a similar website structure. Because it identifies structural similarity between websites, the method is particularly effective at detecting phishing sites built by the same phishers or with the same tools. To confirm this assumption, we conducted an evaluation using phishing sites constructed with phishing kits and the PhishTank dataset, and found a large number of phishing sites that were structurally similar to the kit-built sites. We also applied our method to web access logs provided by ordinary Japanese citizens and detected some previously unknown phishing sites. Finally, we examined the possibility of improving the method based on the importance of web resources, determined from their number of occurrences in web access logs.
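The abstract describes website structure as the types and sizes of a site's web resources, but it does not state how similarity is computed. The sketch below is one possible reading, not the authors' implementation: it assumes each site is a set of (resource type, size in bytes) pairs and compares sites with Jaccard similarity. The weighted variant illustrates the idea of resource importance derived from occurrence counts; the specific weighting (resources seen on many sites count less) and all names such as `jaccard`, `rarity_weights`, and `weighted_jaccard` are assumptions made for illustration.

```python
from collections import Counter
from math import log
from typing import Dict, List, Set, Tuple

# Assumed representation: one (resource_type, size_in_bytes) pair per web resource.
Resource = Tuple[str, int]


def jaccard(a: Set[Resource], b: Set[Resource]) -> float:
    """Share of (type, size) pairs that two sites have in common."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def rarity_weights(sites: List[Set[Resource]]) -> Dict[Resource, float]:
    """Assumed weighting: a resource seen on many sites gets a lower weight."""
    counts = Counter(r for site in sites for r in site)
    n = len(sites)
    return {r: log((1 + n) / (1 + c)) + 1.0 for r, c in counts.items()}


def weighted_jaccard(
    a: Set[Resource], b: Set[Resource], weights: Dict[Resource, float]
) -> float:
    """Jaccard similarity where each resource contributes its weight."""
    union = a | b
    if not union:
        return 0.0
    shared = sum(weights.get(r, 1.0) for r in a & b)
    total = sum(weights.get(r, 1.0) for r in union)
    return shared / total


# Hypothetical data: a site built from a known kit and a suspect site.
kit_site = {("text/html", 5120), ("image/png", 20480), ("text/css", 1024)}
suspect = {("text/html", 5120), ("image/png", 20480), ("application/javascript", 300)}

print(jaccard(kit_site, suspect))  # 0.5 (2 shared pairs out of 4 distinct pairs)
```

In this reading, a suspect site would be flagged when its similarity to a known kit-built site exceeds some threshold. The abstract gives no threshold, so any value chosen would need to be tuned on labeled data.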