{"title":"Phisher Fighter:基于URL和词频逆文档频率值的网站钓鱼检测系统","authors":"E. S. Vishva, D. Aju","doi":"10.13052/jcsm2245-1439.1114","DOIUrl":null,"url":null,"abstract":"Fundamentally, phishing is a common cybercrime that is indulged by the intruders or hackers on naive and credible individuals and make them to reveal their unique and sensitive information through fictitious websites. The primary intension of this kind of cybercrime is to gain access to the ad hominem or classified information from the recipients. The obtained data comprises of information that can very well utilized to recognize an individual. The purloined personal or sensitive information is commonly marketed in the online dark market and subsequently these information will be bought by the personal identity brigands. Depending upon the sensitivity and the importance of the stolen information, the price of a single piece of purloined information would vary from few dollars to thousands of dollars. Machine learning (ML) as well as Deep Learning (DL) are powerful methods to analyse and endeavour against these phishing attacks. A machine learning based phishing detection system is proposed to protect the website and users from such attacks. In order to optimize the results in a better way, the TF-IDF (Term Frequency-Inverse Document Frequency) value of webpages is employed within the system. ML methods such as LR (Logistic Regression), RF (Random Forest), SVM (Support Vector Machine), NB (Naive Bayes) and SGD (Stochastic Gradient Descent) are applied for training and testing the obtained dataset. Henceforth, a robust phishing website detection system is developed with 90.68% accuracy.","PeriodicalId":37820,"journal":{"name":"Journal of Cyber Security and Mobility","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2021-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Phisher Fighter: Website Phishing Detection System Based on URL and Term Frequency-Inverse Document Frequency Values\",\"authors\":\"E. S. Vishva, D. Aju\",\"doi\":\"10.13052/jcsm2245-1439.1114\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Fundamentally, phishing is a common cybercrime that is indulged by the intruders or hackers on naive and credible individuals and make them to reveal their unique and sensitive information through fictitious websites. The primary intension of this kind of cybercrime is to gain access to the ad hominem or classified information from the recipients. The obtained data comprises of information that can very well utilized to recognize an individual. The purloined personal or sensitive information is commonly marketed in the online dark market and subsequently these information will be bought by the personal identity brigands. Depending upon the sensitivity and the importance of the stolen information, the price of a single piece of purloined information would vary from few dollars to thousands of dollars. Machine learning (ML) as well as Deep Learning (DL) are powerful methods to analyse and endeavour against these phishing attacks. A machine learning based phishing detection system is proposed to protect the website and users from such attacks. In order to optimize the results in a better way, the TF-IDF (Term Frequency-Inverse Document Frequency) value of webpages is employed within the system. ML methods such as LR (Logistic Regression), RF (Random Forest), SVM (Support Vector Machine), NB (Naive Bayes) and SGD (Stochastic Gradient Descent) are applied for training and testing the obtained dataset. Henceforth, a robust phishing website detection system is developed with 90.68% accuracy.\",\"PeriodicalId\":37820,\"journal\":{\"name\":\"Journal of Cyber Security and Mobility\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-11-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Cyber Security and Mobility\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.13052/jcsm2245-1439.1114\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"Computer Science\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Cyber Security and Mobility","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.13052/jcsm2245-1439.1114","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Computer Science","Score":null,"Total":0}
Phisher Fighter: Website Phishing Detection System Based on URL and Term Frequency-Inverse Document Frequency Values
Fundamentally, phishing is a common cybercrime that is indulged by the intruders or hackers on naive and credible individuals and make them to reveal their unique and sensitive information through fictitious websites. The primary intension of this kind of cybercrime is to gain access to the ad hominem or classified information from the recipients. The obtained data comprises of information that can very well utilized to recognize an individual. The purloined personal or sensitive information is commonly marketed in the online dark market and subsequently these information will be bought by the personal identity brigands. Depending upon the sensitivity and the importance of the stolen information, the price of a single piece of purloined information would vary from few dollars to thousands of dollars. Machine learning (ML) as well as Deep Learning (DL) are powerful methods to analyse and endeavour against these phishing attacks. A machine learning based phishing detection system is proposed to protect the website and users from such attacks. In order to optimize the results in a better way, the TF-IDF (Term Frequency-Inverse Document Frequency) value of webpages is employed within the system. ML methods such as LR (Logistic Regression), RF (Random Forest), SVM (Support Vector Machine), NB (Naive Bayes) and SGD (Stochastic Gradient Descent) are applied for training and testing the obtained dataset. Henceforth, a robust phishing website detection system is developed with 90.68% accuracy.
期刊介绍:
Journal of Cyber Security and Mobility is an international, open-access, peer reviewed journal publishing original research, review/survey, and tutorial papers on all cyber security fields including information, computer & network security, cryptography, digital forensics etc. but also interdisciplinary articles that cover privacy, ethical, legal, economical aspects of cyber security or emerging solutions drawn from other branches of science, for example, nature-inspired. The journal aims at becoming an international source of innovation and an essential reading for IT security professionals around the world by providing an in-depth and holistic view on all security spectrum and solutions ranging from practical to theoretical. Its goal is to bring together researchers and practitioners dealing with the diverse fields of cybersecurity and to cover topics that are equally valuable for professionals as well as for those new in the field from all sectors industry, commerce and academia. This journal covers diverse security issues in cyber space and solutions thereof. As cyber space has moved towards the wireless/mobile world, issues in wireless/mobile communications and those involving mobility aspects will also be published.