Detecting SQL Injection Web Attacks Using Ensemble Learners and Data Sampling

R. Zuech, John T. Hancock, T. Khoshgoftaar
Published in: 2021 IEEE International Conference on Cyber Security and Resilience (CSR)
Publication date: 2021-07-26
DOI: 10.1109/CSR51186.2021.9527990 (https://doi.org/10.1109/CSR51186.2021.9527990)
Citations: 3

Abstract

SQL Injection web attacks are a common choice among attackers for exploiting web servers. We explore classification performance in detecting SQL Injection web attacks in the recent CSE-CIC-IDS2018 dataset, using the Area Under the Receiver Operating Characteristic Curve (AUC) metric, for seven classifiers: Random Forest (RF), CatBoost (CB), LightGBM (LGB), XGBoost (XGB), Decision Tree (DT), Naive Bayes (NB), and Logistic Regression (LR). The first four are ensemble learners; the last three are single learners included for comparison. Our unique data preparation of CSE-CIC-IDS2018 yields a harsh experimental testbed with the kind of class imbalance encountered in real-world cybersecurity attacks. To the best of our knowledge, we are the first to apply random undersampling techniques to web attacks from the CSE-CIC-IDS2018 dataset while exploring various sampling ratios. We find the ensemble learners to be the most effective at detecting SQL Injection web attacks, but only after first applying massive data sampling.
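As a rough illustration of the two ideas the abstract names, random undersampling at a chosen majority-to-minority ratio and AUC as the evaluation metric, the following minimal sketch uses only the Python standard library. It is not the authors' pipeline: the function names, the toy labels, and the ratios 1, 3, and 9 are all hypothetical, and the abstract does not state which ratios were actually studied.

```python
import random


def random_undersample(labels, majority_to_minority_ratio, seed=0):
    """Keep every minority row (label 1) and a random subset of majority rows
    (label 0) so that kept majority ~= ratio * minority. Returns sorted indices."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    n_keep = min(len(majority), round(majority_to_minority_ratio * len(minority)))
    return sorted(minority + rng.sample(majority, n_keep))


def auc(labels, scores):
    """AUC as the probability that a random positive is scored above a random
    negative, counting ties as one half (O(n_pos * n_neg), fine for toy data)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical, heavily imbalanced toy labels: 5 attack rows, 1000 normal rows.
labels = [1] * 5 + [0] * 1000

# Illustrative ratios only; they are assumptions, not values from the paper.
for ratio in (1, 3, 9):
    kept = random_undersample(labels, ratio)
    n_attack = sum(labels[i] for i in kept)
    print(f"ratio {ratio}:1 -> {len(kept) - n_attack} normal, {n_attack} attack")
```

In a setup like the one described, undersampling would be applied only to the training split, with AUC then computed on untouched, still-imbalanced test data. That ordering is an assumption about good practice rather than a detail taken from the abstract.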