Authors: Rui Fernandes, Nuno Lopes
Venue: 2022 10th International Symposium on Digital Forensics and Security (ISDFS)
Published: 2022-06-06
DOI: 10.1109/ISDFS55398.2022.9800807 (https://doi.org/10.1109/ISDFS55398.2022.9800807)
Network Intrusion Detection Packet Classification with the HIKARI-2021 Dataset: a study on ML Algorithms
An Intrusion Detection System (IDS) is a critical part of a network infrastructure, used to detect and prevent cyberattacks. Artificial Intelligence (AI) has the potential to improve the performance of IDS in achieving cybersecurity. However, one of the current challenges is the lack of good datasets that can improve the results of AI algorithms. In this paper we study the recently published HIKARI-2021 dataset, which was built from real data in a lab, to develop network traffic classification models. A feature selection method was used to evaluate the relevant features, and different Machine Learning (ML) methods were tested with this dataset. The results show that the dataset is suitable for classification and that the number of features can be reduced from 83 to 22 while still maintaining an accuracy of 99%, allowing faster algorithm execution. When using a balanced sample of this dataset, we obtained an accuracy above 80% with some ML algorithms.
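
The abstract does not say how the balanced sample was drawn, so the following is only a minimal sketch of one common approach, random undersampling, in which every class is cut down to the size of the rarest one. The function names (`balanced_sample`, `majority_baseline_accuracy`), the synthetic rows, and the 930/70 class split are all hypothetical and are not taken from the paper or from HIKARI-2021. The sketch also shows a general point about accuracy: on imbalanced labels, a model that always predicts the majority class already scores highly, whereas on a balanced sample that baseline drops to chance.

```python
import random
from collections import Counter, defaultdict


def balanced_sample(rows, labels, seed=0):
    """Randomly undersample every class to the size of the rarest class.

    Assumption: this is one generic balancing strategy, not necessarily
    the procedure used by the paper's authors.
    """
    rng = random.Random(seed)  # seeded for reproducibility

    # Group rows by their class label.
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    # Target size per class: the size of the smallest class.
    n = min(len(group) for group in by_class.values())

    # Draw n rows without replacement from each class.
    sampled = []
    for label in sorted(by_class):
        for row in rng.sample(by_class[label], n):
            sampled.append((row, label))

    rng.shuffle(sampled)  # avoid class-ordered output
    out_rows = [row for row, _ in sampled]
    out_labels = [label for _, label in sampled]
    return out_rows, out_labels


def majority_baseline_accuracy(labels):
    """Accuracy of a trivial model that always predicts the most common class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


# Synthetic stand-in data (hypothetical class ratio, not HIKARI-2021 figures).
rows = [[i, i % 7] for i in range(1000)]
labels = ["benign"] * 930 + ["attack"] * 70

bal_rows, bal_labels = balanced_sample(rows, labels)

print(Counter(bal_labels))
print(majority_baseline_accuracy(labels))
print(majority_baseline_accuracy(bal_labels))
```

Undersampling discards majority-class rows, which keeps the sketch simple. Oversampling the minority class or reweighting classes are alternatives with different trade-offs.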