Phisher Fighter：基于URL和词频逆文档频率值的网站钓鱼检测系统

Q3 Computer Science

Journal of Cyber Security and Mobility Pub Date : 2021-11-20 DOI:10.13052/jcsm2245-1439.1114

E. S. Vishva, D. Aju

{"title":"Phisher Fighter：基于URL和词频逆文档频率值的网站钓鱼检测系统","authors":"E. S. Vishva, D. Aju","doi":"10.13052/jcsm2245-1439.1114","DOIUrl":null,"url":null,"abstract":"Fundamentally, phishing is a common cybercrime that is indulged by the intruders or hackers on naive and credible individuals and make them to reveal their unique and sensitive information through fictitious websites. The primary intension of this kind of cybercrime is to gain access to the ad hominem or classified information from the recipients. The obtained data comprises of information that can very well utilized to recognize an individual. The purloined personal or sensitive information is commonly marketed in the online dark market and subsequently these information will be bought by the personal identity brigands. Depending upon the sensitivity and the importance of the stolen information, the price of a single piece of purloined information would vary from few dollars to thousands of dollars. Machine learning (ML) as well as Deep Learning (DL) are powerful methods to analyse and endeavour against these phishing attacks. A machine learning based phishing detection system is proposed to protect the website and users from such attacks. In order to optimize the results in a better way, the TF-IDF (Term Frequency-Inverse Document Frequency) value of webpages is employed within the system. ML methods such as LR (Logistic Regression), RF (Random Forest), SVM (Support Vector Machine), NB (Naive Bayes) and SGD (Stochastic Gradient Descent) are applied for training and testing the obtained dataset. Henceforth, a robust phishing website detection system is developed with 90.68% accuracy.","PeriodicalId":37820,"journal":{"name":"Journal of Cyber Security and Mobility","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2021-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Phisher Fighter: Website Phishing Detection System Based on URL and Term Frequency-Inverse Document Frequency Values\",\"authors\":\"E. S. Vishva, D. Aju\",\"doi\":\"10.13052/jcsm2245-1439.1114\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Fundamentally, phishing is a common cybercrime that is indulged by the intruders or hackers on naive and credible individuals and make them to reveal their unique and sensitive information through fictitious websites. The primary intension of this kind of cybercrime is to gain access to the ad hominem or classified information from the recipients. The obtained data comprises of information that can very well utilized to recognize an individual. The purloined personal or sensitive information is commonly marketed in the online dark market and subsequently these information will be bought by the personal identity brigands. Depending upon the sensitivity and the importance of the stolen information, the price of a single piece of purloined information would vary from few dollars to thousands of dollars. Machine learning (ML) as well as Deep Learning (DL) are powerful methods to analyse and endeavour against these phishing attacks. A machine learning based phishing detection system is proposed to protect the website and users from such attacks. In order to optimize the results in a better way, the TF-IDF (Term Frequency-Inverse Document Frequency) value of webpages is employed within the system. ML methods such as LR (Logistic Regression), RF (Random Forest), SVM (Support Vector Machine), NB (Naive Bayes) and SGD (Stochastic Gradient Descent) are applied for training and testing the obtained dataset. Henceforth, a robust phishing website detection system is developed with 90.68% accuracy.\",\"PeriodicalId\":37820,\"journal\":{\"name\":\"Journal of Cyber Security and Mobility\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-11-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Cyber Security and Mobility\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.13052/jcsm2245-1439.1114\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"Computer Science\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Cyber Security and Mobility","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.13052/jcsm2245-1439.1114","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Computer Science","Score":null,"Total":0}

引用次数: 3

摘要

从根本上说，网络钓鱼是一种常见的网络犯罪，入侵者或黑客沉迷于天真可信的个人，并使他们通过虚构的网站泄露自己的独特和敏感信息。这类网络犯罪的主要意图是从接收者那里获取个人信息或机密信息。所获得的数据包括可以很好地用于识别个体的信息。被窃取的个人或敏感信息通常在网上黑市上进行营销，随后这些信息将被个人身份窃贼购买。根据被盗信息的敏感性和重要性，一条被盗信息的价格从几美元到数千美元不等。机器学习（ML）和深度学习（DL）是分析和抵御这些网络钓鱼攻击的强大方法。为了保护网站和用户免受此类攻击，提出了一种基于机器学习的网络钓鱼检测系统。为了更好地优化结果，系统中采用了网页的TF-IDF（术语频率逆文档频率）值。应用LR（Logistic Regression）、RF（Random Forest）、SVM（Support Vector Machine）、NB（Naive Bayes）和SGD（Random Gradient Descent）等ML方法对所获得的数据集进行训练和测试。此后，开发了一个强大的钓鱼网站检测系统，准确率为90.68%。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Phisher Fighter: Website Phishing Detection System Based on URL and Term Frequency-Inverse Document Frequency Values

Fundamentally, phishing is a common cybercrime that is indulged by the intruders or hackers on naive and credible individuals and make them to reveal their unique and sensitive information through fictitious websites. The primary intension of this kind of cybercrime is to gain access to the ad hominem or classified information from the recipients. The obtained data comprises of information that can very well utilized to recognize an individual. The purloined personal or sensitive information is commonly marketed in the online dark market and subsequently these information will be bought by the personal identity brigands. Depending upon the sensitivity and the importance of the stolen information, the price of a single piece of purloined information would vary from few dollars to thousands of dollars. Machine learning (ML) as well as Deep Learning (DL) are powerful methods to analyse and endeavour against these phishing attacks. A machine learning based phishing detection system is proposed to protect the website and users from such attacks. In order to optimize the results in a better way, the TF-IDF (Term Frequency-Inverse Document Frequency) value of webpages is employed within the system. ML methods such as LR (Logistic Regression), RF (Random Forest), SVM (Support Vector Machine), NB (Naive Bayes) and SGD (Stochastic Gradient Descent) are applied for training and testing the obtained dataset. Henceforth, a robust phishing website detection system is developed with 90.68% accuracy.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Journal of Cyber Security and Mobility Computer Science-Computer Networks and Communications

CiteScore

2.30

自引率

0.00%

发文量

期刊介绍： Journal of Cyber Security and Mobility is an international, open-access, peer reviewed journal publishing original research, review/survey, and tutorial papers on all cyber security fields including information, computer & network security, cryptography, digital forensics etc. but also interdisciplinary articles that cover privacy, ethical, legal, economical aspects of cyber security or emerging solutions drawn from other branches of science, for example, nature-inspired. The journal aims at becoming an international source of innovation and an essential reading for IT security professionals around the world by providing an in-depth and holistic view on all security spectrum and solutions ranging from practical to theoretical. Its goal is to bring together researchers and practitioners dealing with the diverse fields of cybersecurity and to cover topics that are equally valuable for professionals as well as for those new in the field from all sectors industry, commerce and academia. This journal covers diverse security issues in cyber space and solutions thereof. As cyber space has moved towards the wireless/mobile world, issues in wireless/mobile communications and those involving mobility aspects will also be published.