E-WebGuard:用于精确检测网络攻击的增强型神经架构

IF 4.8 2区 计算机科学 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS
Luchen Zhou , Wei-Chuen Yau , Y.S. Gan , Sze-Teng Liong
{"title":"E-WebGuard:用于精确检测网络攻击的增强型神经架构","authors":"Luchen Zhou ,&nbsp;Wei-Chuen Yau ,&nbsp;Y.S. Gan ,&nbsp;Sze-Teng Liong","doi":"10.1016/j.cose.2024.104127","DOIUrl":null,"url":null,"abstract":"<div><div>Web applications have become a favored tool for organizations to disseminate vast amounts of information to the public. With the increasing adoption and inherent openness of these applications, there is an observed surge in web-based attacks exploited by adversaries. However, most of the web attack detection works are based on public datasets that are outdated or do not cover a sufficient quantity of web application attacks. Furthermore, most of them are binary detection (i.e., normal or attack) and there is little work on multi-class web attack detection. This highlights the crucial need for automated web attack detection models to bolster web security. In this study, a suite of integrated machine learning and deep learning models is designed to detect web attacks. Specifically, this study employs the Character-level Support Vector Machine (Char-SVM), Character-level Long Short-Term Memory (Char-LSTM), Convolutional Neural Network - SVM (CNN-SVM), and CNN-Bi-LSTM models to differentiate between standard HTTP requests and HTTP-based attacks in both the CSIC 2010 and SR-BH 2020 datasets. Note that the CSIC 2010 dataset involves binary classification, while the SR-BH 2020 dataset involves multi-class classification, specifically with 13 classes. Notably, the input data is first converted to the character level before being fed into any of the proposed model architectures. In the binary classification task, the Char-SVM model with a linear kernel outperforms other models, achieving an accuracy rate of 99.60%. The CNN-Bi-LSTM model closely follows with a 99.41% accuracy, surpassing the performance of the CNN-LSTM model presented in previous research. In the context of multi-class classification, the CNN-Bi-LSTM model demonstrates outstanding performance with a 99.63% accuracy rate. Furthermore, the multi-class classification models, namely Char-LSTM and CNN-Bi-LSTM, achieve validation accuracies above 98%, outperforming the two machine learning-based methods mentioned in the original research.</div></div>","PeriodicalId":51004,"journal":{"name":"Computers & Security","volume":"148 ","pages":"Article 104127"},"PeriodicalIF":4.8000,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"E-WebGuard: Enhanced neural architectures for precision web attack detection\",\"authors\":\"Luchen Zhou ,&nbsp;Wei-Chuen Yau ,&nbsp;Y.S. Gan ,&nbsp;Sze-Teng Liong\",\"doi\":\"10.1016/j.cose.2024.104127\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Web applications have become a favored tool for organizations to disseminate vast amounts of information to the public. With the increasing adoption and inherent openness of these applications, there is an observed surge in web-based attacks exploited by adversaries. However, most of the web attack detection works are based on public datasets that are outdated or do not cover a sufficient quantity of web application attacks. Furthermore, most of them are binary detection (i.e., normal or attack) and there is little work on multi-class web attack detection. This highlights the crucial need for automated web attack detection models to bolster web security. In this study, a suite of integrated machine learning and deep learning models is designed to detect web attacks. Specifically, this study employs the Character-level Support Vector Machine (Char-SVM), Character-level Long Short-Term Memory (Char-LSTM), Convolutional Neural Network - SVM (CNN-SVM), and CNN-Bi-LSTM models to differentiate between standard HTTP requests and HTTP-based attacks in both the CSIC 2010 and SR-BH 2020 datasets. Note that the CSIC 2010 dataset involves binary classification, while the SR-BH 2020 dataset involves multi-class classification, specifically with 13 classes. Notably, the input data is first converted to the character level before being fed into any of the proposed model architectures. In the binary classification task, the Char-SVM model with a linear kernel outperforms other models, achieving an accuracy rate of 99.60%. The CNN-Bi-LSTM model closely follows with a 99.41% accuracy, surpassing the performance of the CNN-LSTM model presented in previous research. In the context of multi-class classification, the CNN-Bi-LSTM model demonstrates outstanding performance with a 99.63% accuracy rate. Furthermore, the multi-class classification models, namely Char-LSTM and CNN-Bi-LSTM, achieve validation accuracies above 98%, outperforming the two machine learning-based methods mentioned in the original research.</div></div>\",\"PeriodicalId\":51004,\"journal\":{\"name\":\"Computers & Security\",\"volume\":\"148 \",\"pages\":\"Article 104127\"},\"PeriodicalIF\":4.8000,\"publicationDate\":\"2024-09-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computers & Security\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0167404824004322\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers & Security","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167404824004322","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0

摘要

网络应用程序已成为企业向公众传播大量信息的首选工具。随着这些应用的日益普及和固有的开放性,我们观察到被对手利用的基于网络的攻击激增。然而,大多数网络攻击检测工作都是基于过时的公共数据集,或者没有涵盖足够数量的网络应用程序攻击。此外,大多数检测都是二元检测(即正常或攻击),很少有关于多类网络攻击检测的工作。这凸显了对自动网络攻击检测模型的迫切需要,以加强网络安全。本研究设计了一套集成机器学习和深度学习模型来检测网络攻击。具体来说,本研究采用了字符级支持向量机(Char-SVM)、字符级长短期记忆(Char-LSTM)、卷积神经网络-SVM(CNN-SVM)和 CNN-Bi-LSTM 模型来区分 CSIC 2010 和 SR-BH 2020 数据集中的标准 HTTP 请求和基于 HTTP 的攻击。请注意,CSIC 2010 数据集涉及二元分类,而 SR-BH 2020 数据集涉及多类分类,特别是 13 类。值得注意的是,在将输入数据输入到任何建议的模型架构之前,首先要将其转换为字符级。在二元分类任务中,采用线性核的 Char-SVM 模型优于其他模型,准确率达到 99.60%。CNN-Bi-LSTM 模型紧随其后,准确率达到 99.41%,超过了之前研究中 CNN-LSTM 模型的表现。在多类分类方面,CNN-Bi-LSTM 模型表现突出,准确率达到 99.63%。此外,多类分类模型(即 Char-LSTM 和 CNN-Bi-LSTM)的验证准确率超过 98%,优于原始研究中提到的两种基于机器学习的方法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
E-WebGuard: Enhanced neural architectures for precision web attack detection
Web applications have become a favored tool for organizations to disseminate vast amounts of information to the public. With the increasing adoption and inherent openness of these applications, there is an observed surge in web-based attacks exploited by adversaries. However, most of the web attack detection works are based on public datasets that are outdated or do not cover a sufficient quantity of web application attacks. Furthermore, most of them are binary detection (i.e., normal or attack) and there is little work on multi-class web attack detection. This highlights the crucial need for automated web attack detection models to bolster web security. In this study, a suite of integrated machine learning and deep learning models is designed to detect web attacks. Specifically, this study employs the Character-level Support Vector Machine (Char-SVM), Character-level Long Short-Term Memory (Char-LSTM), Convolutional Neural Network - SVM (CNN-SVM), and CNN-Bi-LSTM models to differentiate between standard HTTP requests and HTTP-based attacks in both the CSIC 2010 and SR-BH 2020 datasets. Note that the CSIC 2010 dataset involves binary classification, while the SR-BH 2020 dataset involves multi-class classification, specifically with 13 classes. Notably, the input data is first converted to the character level before being fed into any of the proposed model architectures. In the binary classification task, the Char-SVM model with a linear kernel outperforms other models, achieving an accuracy rate of 99.60%. The CNN-Bi-LSTM model closely follows with a 99.41% accuracy, surpassing the performance of the CNN-LSTM model presented in previous research. In the context of multi-class classification, the CNN-Bi-LSTM model demonstrates outstanding performance with a 99.63% accuracy rate. Furthermore, the multi-class classification models, namely Char-LSTM and CNN-Bi-LSTM, achieve validation accuracies above 98%, outperforming the two machine learning-based methods mentioned in the original research.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Computers & Security
Computers & Security 工程技术-计算机:信息系统
CiteScore
12.40
自引率
7.10%
发文量
365
审稿时长
10.7 months
期刊介绍: Computers & Security is the most respected technical journal in the IT security field. With its high-profile editorial board and informative regular features and columns, the journal is essential reading for IT security professionals around the world. Computers & Security provides you with a unique blend of leading edge research and sound practical management advice. It is aimed at the professional involved with computer security, audit, control and data integrity in all sectors - industry, commerce and academia. Recognized worldwide as THE primary source of reference for applied research and technical expertise it is your first step to fully secure systems.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信