Low-sample classification in NIDS using the EC-GAN method

J. Univers. Comput. Sci. Pub Date: 2022-12-28 DOI: 10.3897/jucs.85703

Marko Zekan, Igor Tomičić, M. Schatten

Pages: 1330-1346

Citations: 0

Abstract

Numerous advanced methods have been applied over the years in Network Intrusion Detection Systems (NIDS). Among these are various Deep Learning models, which have shown great success in attack classification. Nevertheless, the false positive rate and detection rate of these systems remain a concern. This is mostly because of the low-sample, imbalanced nature of realistic datasets, which makes models challenging to train.

Considering this, we applied a novel semi-supervised EC-GAN method to network flow classification on the CIC-IDS-2017 dataset. EC-GAN uses synthetic data to aid the training of a supervised classifier on low-sample data. To achieve this, we modified the original EC-GAN to work with tabular data. In our approach, WCGAN-GP is used for synthetic tabular data generation, while a simple deep neural network is used for classification. The conditional nature of WCGAN-GP diminishes the class imbalance problem, while the GAN itself solves the low-sample problem. This approach was successful in generating believable synthetic data, which was consequently used for training and testing the EC-GAN.

To obtain our results, we trained a classifier on progressively smaller versions of the CIC-IDS-2017 dataset, first via the novel EC-GAN method and then in the conventional way, without the help of synthetic data. We then compared these two sets of results with another author's results using accuracy, false positive rate, detection rate and macro F1 score as metrics. Our results showed that a supervised classifier trained with EC-GAN can achieve significant results even when trained on as little as 25% of the original imbalanced dataset.
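
The four metrics named in the abstract can be computed from true and predicted flow labels. The sketch below is illustrative only and is not taken from the paper: the function name `evaluate` is hypothetical, the benign class label is assumed to be `"BENIGN"`, and the false positive rate and detection rate are assumed to be measured on a binary benign-versus-attack view of the predictions (the paper may define them per class). Macro F1 is the unweighted mean of the per-class F1 scores, so rare attack classes count as much as the majority class.

```python
from collections import Counter


def evaluate(y_true, y_pred, benign="BENIGN"):
    """Return accuracy, false positive rate, detection rate and macro F1.

    Assumption: FPR and detection rate use a binary view in which every
    label other than `benign` counts as an attack.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Count each (true label, predicted label) pair once.
    pairs = Counter(zip(y_true, y_pred))
    labels = set(y_true) | set(y_pred)

    accuracy = sum(n for (t, p), n in pairs.items() if t == p) / len(y_true)

    # Binary benign-vs-attack counts.
    fp = sum(n for (t, p), n in pairs.items() if t == benign and p != benign)
    tn = pairs[(benign, benign)]
    tp = sum(n for (t, p), n in pairs.items() if t != benign and p != benign)
    fn = sum(n for (t, p), n in pairs.items() if t != benign and p == benign)
    fpr = fp / (fp + tn) if fp + tn else 0.0
    detection_rate = tp / (tp + fn) if tp + fn else 0.0

    # Per-class F1 = 2*TP / (2*TP + FP + FN), averaged without class weights.
    f1_scores = []
    for c in labels:
        tp_c = pairs[(c, c)]
        fp_c = sum(n for (t, p), n in pairs.items() if p == c and t != c)
        fn_c = sum(n for (t, p), n in pairs.items() if t == c and p != c)
        denom = 2 * tp_c + fp_c + fn_c
        f1_scores.append(2 * tp_c / denom if denom else 0.0)
    macro_f1 = sum(f1_scores) / len(f1_scores)

    return {
        "accuracy": accuracy,
        "fpr": fpr,
        "detection_rate": detection_rate,
        "macro_f1": macro_f1,
    }


# Toy example with made-up labels: 6 benign flows, 3 DoS, 1 PortScan.
y_true = ["BENIGN"] * 6 + ["DoS"] * 3 + ["PortScan"]
y_pred = ["BENIGN"] * 5 + ["DoS"] + ["DoS", "DoS", "BENIGN"] + ["PortScan"]
metrics = evaluate(y_true, y_pred)
```

On this toy input the accuracy is 0.8, the false positive rate is 1/6, the detection rate is 0.75, and the macro F1 is about 0.833. The single PortScan flow contributes a full third of the macro F1, which is why that metric is informative on imbalanced data.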
