Efficacy of Heterogeneous Ensemble Assisted Machine Learning Model for Binary and Multi-Class Network Intrusion Detection

2021 IEEE International Conference on Automatic Control & Intelligent Systems (I2CACIS) | Pub Date: 2021-06-26 | DOI: 10.1109/I2CACIS52118.2021.9495864

Toya Acharya, Ishan Khatri, A. Annamalai, M. Chouikha


Citations: 8

Abstract

The exponential rise of internet technologies and allied applications, which encompass a very large number of networked devices, has pushed academia and industry to seek more effective and robust security solutions. Digitization has undeniably driven a global revolution; however, the resulting security threats, breaches, and losses indicate the need for a robust cybersecurity solution. Unlike classical intrusion detection systems (IDS), network IDS (NIDS) has become increasingly challenging because attack patterns and anomalous behavior change continuously. Data-driven machine learning methods have performed better by learning from network traffic information and detecting anomalies; however, their generalization over a network with both known and unknown patterns remains questionable. Moreover, most classical approaches fail to address the key issues of class imbalance, level-of-significance-centric feature selection, normalization, and over-fitting, so performance varies across machine learning models. In this paper, a novel and robust heterogeneous ensemble machine learning model is developed to detect anomalies in NIDS. The proposed model first applies sub-sampling to alleviate the class-imbalance problem of NIDS datasets. It then normalizes the input data with the Min-Max algorithm, mapping it to the range 0 to 1, to alleviate overfitting and convergence problems. Feature reduction then retains the most suitable features without imposing the computational overhead often seen in meta-heuristic-based approaches. Finally, the proposed NIDS solution builds a heterogeneous ensemble learning model with J48, k-NN, SVM, Bagging, AdaBoost, and RF algorithms as base classifiers to perform two-class as well as multi-class classification over the feature-selected NSL-KDD, KDD99, and UNSW-NB-15 datasets. Performance assessment in terms of true-positive rate, false-positive rate, and AUC revealed that the proposed NIDS model outperformed the standalone classifiers and was superior to other existing anomaly detection methods.
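
The preprocessing and combination steps named in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation. The abstract does not say how sub-sampling is balanced or how the base classifiers' outputs are merged, so balancing every class down to the rarest class and combining by simple majority vote are assumptions made here for illustration, and the function names `undersample`, `min_max_scale`, and `majority_vote` are hypothetical.

```python
import random
from collections import Counter, defaultdict


def undersample(rows, labels, seed=0):
    """Randomly drop rows so each class keeps as many rows as the rarest class.

    Assumption: the paper only says "sub-sampling"; balancing to the
    minority-class size is one common choice, not necessarily theirs.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    target = min(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        for row in rng.sample(members, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def min_max_scale(rows):
    """Map each feature column to [0, 1] via (x - min) / (max - min)."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            (x - lo) / (hi - lo) if hi > lo else 0.0  # constant column maps to 0
            for x, lo, hi in zip(row, lows, highs)
        ]
        for row in rows
    ]


def majority_vote(predictions):
    """Return the label predicted by the most base classifiers.

    Assumption: plain majority voting; ties go to the label seen first.
    """
    return Counter(predictions).most_common(1)[0][0]
```

In this sketch, `majority_vote` would receive one predicted label per base classifier (for example, one each from J48, k-NN, SVM, Bagging, AdaBoost, and RF) for a single traffic record; training those classifiers and the feature-reduction step are out of scope here.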


