Application of Natural Language Processing for Phishing Detection Using Machine and Deep Learning Models
Alonica R. Villanueva, Christian Atibagos, Jericko De Guzman, John Carlo Dela Cruz, Menchie M. Rosales, Ryan Francisco
2022 International Conference on ICT for Smart Society (ICISS), published 2022-08-10
DOI: 10.1109/ICISS55894.2022.9915037 (https://doi.org/10.1109/ICISS55894.2022.9915037)
Citations: 0
Abstract
Phishing scams are internet frauds that target people by sending them harmful links. Many victims, ranging from individuals to large companies, have suffered losses due to phishing, highlighting the growing need to detect and block a phishing attack as soon as it is received. This paper applies machine learning and deep learning models to detect phishing attacks through natural language processing of Uniform Resource Locators (URLs). Machine learning algorithms such as Logistic Regression and Multinomial Naive Bayes were used to classify URLs as legitimate or phishing. Additionally, Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and Bidirectional Recurrent Neural Network models were used as deep learning models. Two of these models, LSTM and GRU, achieved high training and validation scores, with an overall accuracy of 95%. The Bidirectional Recurrent Neural Network using GRU and the Bidirectional Recurrent Neural Network using LSTM each show 97% accuracy. Therefore, using multiple deep learning models to predict whether URLs are phishing or legitimate can significantly assist in reviewing websites. Further research could incorporate features other than URLs, together with different deep learning models, to improve the accuracy of phishing detection.
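To make the idea of treating a URL as text more concrete, here is a minimal sketch of one of the classical baselines named in the abstract, a multinomial Naive Bayes classifier over URL tokens. It is not the authors' implementation. The tokenization scheme, the class name `TinyMultinomialNB`, the add-one smoothing, and the six training URLs (all on reserved example domains) are assumptions made purely for illustration, and the deep learning models are not shown.

```python
import math
import re
from collections import Counter


def tokenize_url(url):
    """Split a URL into lowercase alphanumeric tokens (an assumed scheme)."""
    return [t for t in re.split(r"[^a-z0-9]+", url.lower()) if t]


class TinyMultinomialNB:
    """Hypothetical from-scratch multinomial Naive Bayes with add-one smoothing."""

    def fit(self, urls, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.token_counts = {c: Counter() for c in self.classes}
        for url, label in zip(urls, labels):
            self.token_counts[label].update(tokenize_url(url))
        self.vocab = set()
        for counts in self.token_counts.values():
            self.vocab.update(counts)
        return self

    def log_scores(self, url):
        # Tokens never seen in training are ignored.
        tokens = [t for t in tokenize_url(url) if t in self.vocab]
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for c in self.classes:
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[c] / total_docs)
            denom = sum(self.token_counts[c].values()) + len(self.vocab)
            for t in tokens:
                score += math.log((self.token_counts[c][t] + 1) / denom)
            scores[c] = score
        return scores

    def predict(self, url):
        scores = self.log_scores(url)
        return max(scores, key=scores.get)


# Made-up toy data on reserved domains; not taken from the paper's dataset.
TOY_DATA = [
    ("http://secure-login.example-bank.verify-account.test/update", "phishing"),
    ("http://paypa1-account.verify.test/login/confirm", "phishing"),
    ("http://free-gift.test/claim/verify/login", "phishing"),
    ("https://www.example.org/docs/tutorial", "legitimate"),
    ("https://news.example.com/world/article", "legitimate"),
    ("https://www.example.edu/library/catalog", "legitimate"),
]


def train_toy_model():
    urls = [u for u, _ in TOY_DATA]
    labels = [y for _, y in TOY_DATA]
    return TinyMultinomialNB().fit(urls, labels)


if __name__ == "__main__":
    model = train_toy_model()
    for url in ("http://verify-login.test/account",
                "https://www.example.org/library"):
        print(url, "->", model.predict(url))
```

A toy set this small only shows the mechanics. A realistic experiment would need a large labeled URL corpus and a held-out validation split, and nothing here reproduces the accuracy figures reported in the abstract.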