HCNN-LSTM: Hybrid Convolutional Neural Network with Long Short-Term Memory Integrated for Legitimate Web Prediction

IF 0.7 · Zone 4 Computer Science · Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Journal of Web Engineering Pub Date : 2023-07-01 DOI:10.13052/jwe1540-9589.2251

Candra Zonyfar;Jung-Been Lee;Jeong-Dong Kim

Volume 22, issue 5, pages 757-782. Full text: https://ieeexplore.ieee.org/document/10374423/

Citations: 0

Abstract

Phishing is the technique attackers use most frequently to deceive Internet users and obtain sensitive information such as login credentials and credit card numbers. It is therefore important for users to be able to recognize legitimate websites and avoid the traps set by fake ones. However, lay users find it difficult to distinguish legitimate websites because phishing techniques evolve continually. A legitimate-website detection system therefore offers users an easy way to avoid phishing websites. To address this problem, we present a hybrid deep learning model that combines a convolutional neural network and long short-term memory (HCNN-LSTM). The model couples a one-dimensional CNN with an LSTM network, with estimation shared across all sublayers, and is applied to a benchmark dataset for phishing prediction that consists of 11,430 URLs with 87 extracted attributes, of which 56 parameters are selected from URL structure and syntax. In binary classification, the HCNN-LSTM model achieved accuracy, precision, recall, and F1-score of 95.19%, 95.00%, 95.00%, and 95.00%, respectively, outperforming the CNN and LSTM models. These results show that the proposed model is a competitive new model for legitimate web prediction tasks.
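To make the data flow of a hybrid 1D CNN plus LSTM classifier concrete, the following is a minimal pure-Python sketch of a forward pass. It is not the authors' architecture: the abstract does not give layer sizes, so the filter count, kernel width, hidden size, the helper names (`conv1d_relu`, `lstm_final_state`, `predict_proba`, `random_model`), and the random untrained weights are all assumptions chosen for illustration. Only the input length of 56 features comes from the abstract.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def conv1d_relu(seq, kernels, biases):
    """Valid 1D convolution over a single-channel sequence, then ReLU.

    Returns one list per time step, holding one activation per filter.
    """
    width = len(kernels[0])
    steps = []
    for t in range(len(seq) - width + 1):
        window = seq[t:t + width]
        steps.append([
            max(0.0, sum(w * x for w, x in zip(kernel, window)) + bias)
            for kernel, bias in zip(kernels, biases)
        ])
    return steps


def lstm_final_state(steps, params, hidden):
    """Run a single-layer LSTM over the steps and return the last hidden state."""
    h = [0.0] * hidden
    c = [0.0] * hidden
    for x in steps:
        xh = x + h  # concatenate current input and previous hidden state
        # Pre-activations for the input (i), forget (f), output (o)
        # and candidate (g) gates.
        pre = {
            gate: [sum(w * v for w, v in zip(row, xh)) + b
                   for row, b in zip(params[gate]["W"], params[gate]["b"])]
            for gate in "ifog"
        }
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h


def predict_proba(features, model):
    """Conv1D -> LSTM -> dense sigmoid; returns a score in (0, 1)."""
    steps = conv1d_relu(features, model["kernels"], model["conv_bias"])
    h = lstm_final_state(steps, model["lstm"], model["hidden"])
    logit = sum(w * v for w, v in zip(model["dense_w"], h)) + model["dense_b"]
    return sigmoid(logit)


def random_model(n_filters=4, width=3, hidden=8, seed=0):
    """Hypothetical untrained weights; all sizes are illustrative assumptions."""
    rng = random.Random(seed)

    def rand(n):
        return [rng.uniform(-0.5, 0.5) for _ in range(n)]

    return {
        "kernels": [rand(width) for _ in range(n_filters)],
        "conv_bias": rand(n_filters),
        "hidden": hidden,
        "lstm": {
            gate: {"W": [rand(n_filters + hidden) for _ in range(hidden)],
                   "b": rand(hidden)}
            for gate in "ifog"
        },
        "dense_w": rand(hidden),
        "dense_b": 0.0,
    }


model = random_model()
feature_rng = random.Random(1)
features = [feature_rng.random() for _ in range(56)]  # stand-in for 56 URL features
print(predict_proba(features, model))
```

Because the weights are random, the printed score carries no meaning; the sketch only shows how convolutional feature maps over the attribute vector can be fed step by step into an LSTM whose final hidden state drives a binary output. A trained model would normally be built with a deep learning framework rather than by hand.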


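Source journal

Journal of Web Engineering (Engineering and Technology - Computer Science: Theory and Methods)

CiteScore: 1.80

Self-citation rate: 12.50%

Articles published:

Review time: 9 months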

About the journal: The World Wide Web and its associated technologies have become a major implementation and delivery platform for a large variety of applications, ranging from simple institutional information Web sites to sophisticated supply-chain management systems, financial applications, e-government, distance learning, and entertainment, among others. Such applications, in addition to their intrinsic functionality, also exhibit the more complex behavior of distributed applications.