Automatic phishing website detection and prevention model using transformer deep belief network
Authors: Amol Babaso Majgave, Nitin L. Gavankar
Journal: Computers & Security (JCR Q1, Computer Science, Information Systems; impact factor 4.8)
Published: 2024-08-25
DOI: 10.1016/j.cose.2024.104071
URL: https://www.sciencedirect.com/science/article/pii/S0167404824003766
Citation count: 0
Abstract
In a digitally connected world, cybersecurity is paramount, and phishing, in which attackers pose as trusted entities to steal sensitive data, looms large. The proliferation of phishing attacks on the internet poses a substantial threat to individuals and organizations, compromising sensitive information and causing financial and reputational damage. This study aims to establish an automated system for the early detection and prevention of phishing websites, thereby enhancing online security and protecting users from cyber threats. The approach first applies a One Hot Encoding (OHE) pre-processing step that converts every URL string into a numerical vector of fixed dimension. It then uses two feature selection techniques, transfer learning-based feature extraction with DarkNet19 and a Variational Autoencoder (VAE), to select the most important features. Robust security mechanisms are presented to prevent phishing attacks and safeguard personal information on websites. List-based and deep learning-based systems are employed to prevent and detect phishing URLs more efficiently. The study proposes a transformer-based Deep Belief Network (TB-DBN), a pre-trained deep transformer network model for phishing behaviour detection. A cross-validation technique with grid search hyper-parameter tuning, based on the Intelligence Binary Bat Algorithm (IBBA), was designed for the proposed hybrid model. Phishing URLs were classified using a probabilistic estimation-guided boosting classifier, and performance was evaluated in terms of accuracy, precision, recall, specificity, and F1-score. The risk level associated with a URL is assessed from factors such as the source's reputation, content analysis results, and behavioural anomalies.
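As a rough illustration of the one-hot pre-processing step mentioned in the abstract, the sketch below encodes a URL character by character into a vector of fixed dimension. The abstract does not give the alphabet, the vector length, or how unknown characters are handled, so `ALPHABET`, `MAX_LEN`, `one_hot_encode_url`, the lowercasing, and the truncation and zero-padding rules are all assumptions made for this example, not the authors' settings.

```python
import string

# Assumed character set: lowercase letters, digits, and the punctuation
# that RFC 3986 allows in URIs. The paper does not specify its alphabet.
ALPHABET = string.ascii_lowercase + string.digits + "-._~:/?#[]@!$&'()*+,;=%"
MAX_LEN = 64  # assumed fixed URL length; longer URLs are truncated
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def one_hot_encode_url(url, max_len=MAX_LEN):
    """Return a flat 0/1 list of length max_len * len(ALPHABET)."""
    width = len(ALPHABET)
    vector = [0] * (max_len * width)
    # Lowercasing is a simplification: it discards case information in paths.
    for position, ch in enumerate(url.lower()[:max_len]):
        index = CHAR_TO_INDEX.get(ch)
        if index is not None:  # characters outside the alphabet stay all-zero
            vector[position * width + index] = 1
    return vector


vector = one_hot_encode_url("http://example.com/login")
print(len(vector), sum(vector))
```

Every URL maps to a vector of the same length regardless of how long the URL is, which is what allows a downstream network to consume it. Short URLs are padded with zero rows under this scheme.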
The computational complexity of training the deep learning (DL) model depends on factors such as the model's complexity, the size of the training data, and the optimization algorithm used. The results show that tuning the model's variables increases the effectiveness of Python-based deep learning systems. The proposed method achieves an accuracy of 99.4%, precision of 99.2%, recall of 99.3%, and an F1-score of 99.2%. The proposed automatic phishing website detection and prevention model, based on a Transformer-based Deep Belief Network, offers high accuracy and adaptability, strengthening cybersecurity measures that safeguard sensitive user information and mitigate the substantial threat of phishing attacks.
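The evaluation metrics named in the abstract follow the standard confusion-matrix definitions, sketched below. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the paper. The last lines use only the precision and recall reported in the abstract to show the F1-score they imply.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also called sensitivity
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        # F1 is the harmonic mean of precision and recall.
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts for illustration only (phishing = positive class).
metrics = classification_metrics(tp=90, fp=10, tn=80, fn=20)
print(metrics)

# F1 implied by the precision and recall reported in the abstract.
reported_precision, reported_recall = 0.992, 0.993
implied_f1 = (2 * reported_precision * reported_recall
              / (reported_precision + reported_recall))
print(f"implied F1: {implied_f1:.4f}")
```

The implied value is roughly 99.25%, which agrees with the reported F1-score of 99.2% to within rounding.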
About the journal:
Computers & Security is the most respected technical journal in the IT security field. With its high-profile editorial board and informative regular features and columns, the journal is essential reading for IT security professionals around the world.
Computers & Security provides you with a unique blend of leading-edge research and sound practical management advice. It is aimed at the professional involved with computer security, audit, control and data integrity in all sectors: industry, commerce and academia. Recognized worldwide as the primary source of reference for applied research and technical expertise, it is your first step to fully secure systems.