Network Intrusion Detection Packet Classification with the HIKARI-2021 Dataset: a study on ML Algorithms

2022 10th International Symposium on Digital Forensics and Security (ISDFS); Pub Date: 2022-06-06; DOI: 10.1109/ISDFS55398.2022.9800807

Rui Fernandes, Nuno Lopes


Citations: 3

Abstract

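The Intrusion Detection System (IDS) is a critical part of network infrastructure for detecting and preventing cyberattacks. Artificial Intelligence (AI) has the potential to improve IDS performance in achieving cybersecurity. However, one current challenge is the lack of good datasets that can improve the results of AI algorithms. In this paper we study the recently published HIKARI-2021 dataset, built from real data in a lab to develop network traffic and classification models. A feature selection method was used to evaluate the relevant features, and different Machine Learning (ML) methods were tested with this dataset. The results show that the dataset is suitable for classification and that its feature count can be reduced from 83 to 22 while still maintaining an accuracy of 99%, allowing faster algorithm execution. When using a balanced sample of this dataset, we obtained an accuracy above 80% with some ML algorithms.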
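The two preprocessing ideas named in the abstract, reducing 83 features to 22 and drawing a balanced sample, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which feature selection method or sampling scheme the authors used, so the correlation-based ranking, the undersampling strategy, the function names (`select_top_k`, `balanced_sample`, `make_toy_flows`), and the synthetic data standing in for HIKARI-2021 are all hypothetical.

```python
import random
from collections import Counter, defaultdict
from statistics import mean, pstdev


def select_top_k(X, y, k):
    """Return the indices of the k features with the highest
    absolute Pearson correlation to the label (an assumed criterion)."""
    n_features = len(X[0])
    y_mean, y_std = mean(y), pstdev(y)
    scores = []
    for j in range(n_features):
        col = [row[j] for row in X]
        c_mean, c_std = mean(col), pstdev(col)
        if c_std == 0 or y_std == 0:
            scores.append(0.0)  # a constant column carries no signal
            continue
        cov = mean((c - c_mean) * (t - y_mean) for c, t in zip(col, y))
        scores.append(abs(cov / (c_std * y_std)))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def balanced_sample(X, y, seed=0):
    """Randomly undersample every class to the size of the rarest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    n_min = min(len(idx) for idx in by_class.values())
    keep = []
    for idx in by_class.values():
        keep.extend(rng.sample(idx, n_min))
    keep.sort()
    return [X[i] for i in keep], [y[i] for i in keep]


def make_toy_flows(n=2000, n_features=83, n_informative=22,
                   attack_rate=0.07, seed=1):
    """Synthetic stand-in data: only the first n_informative columns
    shift with the label. All numbers here are made up for the demo."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < attack_rate else 0
        row = [rng.gauss(1.5 * label if j < n_informative else 0.0, 1.0)
               for j in range(n_features)]
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_flows()
kept = select_top_k(X, y, k=22)                  # 83 -> 22 columns
X_small = [[row[j] for j in kept] for row in X]
X_bal, y_bal = balanced_sample(X_small, y)       # equal class counts

# Accuracy of always predicting the majority class on the raw data.
majority_baseline = max(Counter(y).values()) / len(y)
print(len(X_small[0]), Counter(y_bal), round(majority_baseline, 3))
```

The majority-class baseline is included as a reference point: on an imbalanced sample a classifier that always predicts the dominant class already scores highly, so accuracy figures on imbalanced and balanced samples are not directly comparable.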



