Efficient Network Intrusion Detection Using PCA-Based Dimensionality Reduction of Features

2019 International Symposium on Networks, Computers and Communications (ISNCC). Pub Date: 2019-06-01. DOI: 10.1109/ISNCC.2019.8909140

Razan Abdulhammed, M. Faezipour, Hassan Musafer, Abdel-shakour Abuzneid


Citations: 20

Abstract

Designing a machine learning based network intrusion detection system (IDS) with high-dimensional features can lead to prolonged classification processes, whereas low-dimensional features can shorten them. Moreover, classifying network traffic with imbalanced class distributions significantly limits the performance attainable by most well-known classifiers. With imbalanced data, the commonly used metrics may fail to provide adequate information about classifier performance. This study first uses Principal Component Analysis (PCA) as a feature dimensionality reduction approach. The resulting low-dimensional features are then used to build various classifiers, such as Random Forest (RF), Bayesian Network, Linear Discriminant Analysis (LDA), and Quadratic Discriminant Analysis (QDA), for designing an IDS. The experimental findings with low-dimensional features in binary and multi-class classification show better performance in terms of Detection Rate (DR), F-Measure, False Alarm Rate (FAR), and Accuracy. Furthermore, we apply a Multi-Class Combined performance metric, CombinedMc, that accounts for class distribution by incorporating FAR, DR, Accuracy, and class distribution parameters. In addition, we developed a uniform distribution based balancing approach to handle the imbalanced distribution of minority class instances in the CICIDS2017 network intrusion dataset. We were able to reduce the CICIDS2017 dataset's feature dimensions from 81 to 10 using PCA, while maintaining a high accuracy of 99.6% in both multi-class and binary classification.
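The abstract names two building blocks: a PCA projection that takes 81 features down to 10, and evaluation by Detection Rate, False Alarm Rate, and Accuracy. The following is a minimal sketch of those two pieces in NumPy, not the authors' implementation. It assumes PCA is computed by SVD on mean-centered data without feature scaling (the abstract does not state the preprocessing), and it uses the common binary definitions DR = TP / (TP + FN) and FAR = FP / (FP + TN). The function names `pca_reduce` and `binary_ids_metrics` are hypothetical, and random data stands in for CICIDS2017.

```python
import numpy as np


def pca_reduce(X, n_components=10):
    """Project X onto its top principal components (assumed: centering only)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # center each feature
    # Rows of Vt are the principal directions, ordered by singular value.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    explained_ratio = (S ** 2) / np.sum(S ** 2)
    return Xc @ components.T, components, explained_ratio[:n_components]


def binary_ids_metrics(y_true, y_pred):
    """DR, FAR, and Accuracy for binary labels (1 = attack, 0 = benign)."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = int(np.sum(y_true & y_pred))
    tn = int(np.sum(~y_true & ~y_pred))
    fp = int(np.sum(~y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))
    dr = tp / (tp + fn) if (tp + fn) else 0.0    # share of attacks detected
    far = fp / (fp + tn) if (fp + tn) else 0.0   # share of benign flows flagged
    acc = (tp + tn) / y_true.size
    return {"DR": dr, "FAR": far, "Accuracy": acc}


if __name__ == "__main__":
    # Random stand-in data with the 81 features mentioned in the abstract.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 81))
    Z, components, ratio = pca_reduce(X, n_components=10)
    print(Z.shape)  # (200, 10)
    print(binary_ids_metrics([1, 1, 0, 0, 0], [1, 0, 0, 0, 1]))
```

The reduced matrix `Z` could then be passed to any of the classifiers listed in the abstract. On imbalanced traffic, reporting FAR and DR alongside Accuracy matters because a classifier that labels everything as benign can still score a high Accuracy.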


