Attenuating majority attack class bias using hybrid deep learning based IDS framework

IF 7.7 | Zone 2 (Computer Science) | JCR Q1: COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Journal of Network and Computer Applications, Volume 230, Article 103954 | Pub Date: 2024-07-03 | DOI: 10.1016/j.jnca.2024.103954

K.G. Raghavendra Narayan, Rakesh Ganesula, Tamminaina Sai Somasekhar, Srijanee Mookherji, Vanga Odelu, Rajendra Prasath, Alavalapati Goutham Reddy

Citations: 0

Abstract

In real-time application domains such as finance, healthcare, and defence, service delays or information theft may have unrecoverable consequences, so early detection of intrusions is important for preventing security breaches. Anomaly-based intrusion detection using hybrid deep learning approaches has recently become more popular. The benchmark datasets most used in the literature, NSL-KDD and UNSW-NB15, are imbalanced. Models built on imbalanced datasets may be biased towards the majority classes and neglect the minority classes, even though all classes are equally important. In many cases, high accuracy is achieved for the majority classes while class-level performance on the minority classes is poor. Class balancing also plays an important role in attenuating prediction bias on imbalanced datasets. This paper proposes a Hybrid Deep Learning Based Intrusion Detection (HDLBID) framework built on a CNN-BiLSTM combination. Four techniques, namely Random Oversampling (ROS), ADASYN, SMOTE, and SMOTE-Tomek, are used for class balancing in the proposed framework. HDLBID with SMOTE-Tomek achieves an overall accuracy of 99.6% on NSL-KDD and 89.02% on UNSW-NB15, an improvement of 13.67% for NSL-KDD and 10.62% for UNSW-NB15 over recent related models. In addition to overall accuracy, the class-level $F_{1}$ score is calculated. A comparative study of balanced versus imbalanced datasets shows that SMOTE-Tomek class balancing performs comparatively well: with it, an improvement of 37.43% is observed for the U2R class of NSL-KDD and of 61.65% for the Worms class of UNSW-NB15. The proposed HDLBID with SMOTE-Tomek class balancing therefore reports the best overall accuracy among recent related approaches, and in class-level analysis it reports better results with SMOTE-Tomek than with the imbalanced versions of the datasets.
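
The abstract names SMOTE-Tomek as the best-performing balancing step but does not describe an implementation. As a rough illustration of the general idea only, the pure-Python sketch below oversamples a minority class by interpolating between nearby minority points (the SMOTE step) and then removes Tomek links, that is, pairs of points from different classes that are each other's nearest neighbour. The function names, the toy 2-D data, the label "u2r", and the choice k=3 are all assumptions made for this example; it is not the authors' pipeline and does not use the NSL-KDD or UNSW-NB15 data.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(X, y, minority, n_new, k=3, seed=0):
    """Add n_new synthetic minority points, each interpolated between a
    minority sample and one of its k nearest minority neighbours.
    Assumes the minority class has at least two samples."""
    rng = random.Random(seed)
    pts = [x for x, label in zip(X, y) if label == minority]
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(pts)
        neighbours = sorted((p for p in pts if p is not base),
                            key=lambda p: sq_dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position on the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return X + synthetic, y + [minority] * n_new


def remove_tomek_links(X, y):
    """Drop both members of every Tomek link (mutual nearest neighbours
    that carry different labels). Dropping both is a choice of this sketch."""
    def nearest(i):
        return min((j for j in range(len(X)) if j != i),
                   key=lambda j: sq_dist(X[i], X[j]))

    nn = [nearest(i) for i in range(len(X))]
    drop = {i for i, j in enumerate(nn) if nn[j] == i and y[i] != y[j]}
    keep = [i for i in range(len(X)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy imbalanced data (hypothetical): 40 "normal" points, 5 "u2r" points.
data_rng = random.Random(42)
normal = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(40)]
rare = [[data_rng.gauss(2.5, 0.5), data_rng.gauss(2.5, 0.5)] for _ in range(5)]
X = normal + rare
y = ["normal"] * 40 + ["u2r"] * 5

X_bal, y_bal = smote(X, y, "u2r", n_new=35)   # oversample up to 40 vs 40
X_res, y_res = remove_tomek_links(X_bal, y_bal)  # clean the class boundary

print("original:   ", Counter(y))
print("after SMOTE:", Counter(y_bal))
print("after Tomek:", Counter(y_res))
```

In this two-class toy setting every Tomek link removes one point from each class, so the classes stay balanced after cleaning. For real experiments a maintained library implementation would normally be preferable to this quadratic-time sketch.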

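Source journal

Journal of Network and Computer Applications (Engineering & Technology: Computer Science, Interdisciplinary Applications)

CiteScore: 21.50
Self-citation rate: 3.40%
Articles published: 142
Review time: 37 days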

About the journal: The Journal of Network and Computer Applications welcomes research contributions, surveys, and notes in all areas relating to computer networks and applications thereof. Sample topics include new design techniques; interesting or novel applications, components, or standards; computer networks with tools such as WWW; emerging standards for internet protocols; wireless networks; mobile computing; emerging computing models such as cloud computing and grid computing; and applications of networked systems for remote collaboration and telemedicine. The journal is abstracted and indexed in Scopus, Engineering Index, Web of Science, Science Citation Index Expanded, and INSPEC.