基于机器学习的网络入侵检测预处理影响分析

Hüseyin Güney
{"title":"基于机器学习的网络入侵检测预处理影响分析","authors":"Hüseyin Güney","doi":"10.35377/saucis...1223054","DOIUrl":null,"url":null,"abstract":"Machine learning (ML) has been frequently used to build intelligent systems in many problem domains, including cybersecurity. For malicious network activity detection, ML-based intrusion detection systems (IDSs) are promising due to their ability to classify attacks autonomously after learning process. However, this is a challenging task due to the vast number of available methods in the current literature, including ML classification algorithms and preprocessing techniques. For analysis the impact of preprocessing techniques on the ML algorithm, this study has conducted extensive experiments, using support vector machines (SVM), the classifier and the FS technique, several normalisation techniques, and a grid-search classifier optimisation algorithm. These methods were sequentially tested on three publicly available network intrusion datasets, NSL-KDD, UNSW-NB15, and CICIDS2017. Subsequently, the results were analysed to investigate the impact of each model and to extract the insights for building intelligent and efficient IDS. The results exhibited that data preprocessing significantly improves classification performance and log-scaling normalisation outperformed other techniques for intrusion detection datasets. Additionally, the results suggested that the embedded SVM-FS is accurate and classifier optimisation can improve performance of classifier-dependent FS techniques. However, feature selection in classifier optimisation is a critical problem that must be addressed. In conclusion, this study provides insights for building ML-based NIDS by revealing important information about data preprocessing.","PeriodicalId":257636,"journal":{"name":"Sakarya University Journal of Computer and Information Sciences","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Preprocessing Impact Analysis for Machine Learning-Based Network Intrusion Detection\",\"authors\":\"Hüseyin Güney\",\"doi\":\"10.35377/saucis...1223054\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Machine learning (ML) has been frequently used to build intelligent systems in many problem domains, including cybersecurity. For malicious network activity detection, ML-based intrusion detection systems (IDSs) are promising due to their ability to classify attacks autonomously after learning process. However, this is a challenging task due to the vast number of available methods in the current literature, including ML classification algorithms and preprocessing techniques. For analysis the impact of preprocessing techniques on the ML algorithm, this study has conducted extensive experiments, using support vector machines (SVM), the classifier and the FS technique, several normalisation techniques, and a grid-search classifier optimisation algorithm. These methods were sequentially tested on three publicly available network intrusion datasets, NSL-KDD, UNSW-NB15, and CICIDS2017. Subsequently, the results were analysed to investigate the impact of each model and to extract the insights for building intelligent and efficient IDS. The results exhibited that data preprocessing significantly improves classification performance and log-scaling normalisation outperformed other techniques for intrusion detection datasets. Additionally, the results suggested that the embedded SVM-FS is accurate and classifier optimisation can improve performance of classifier-dependent FS techniques. However, feature selection in classifier optimisation is a critical problem that must be addressed. In conclusion, this study provides insights for building ML-based NIDS by revealing important information about data preprocessing.\",\"PeriodicalId\":257636,\"journal\":{\"name\":\"Sakarya University Journal of Computer and Information Sciences\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-04-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Sakarya University Journal of Computer and Information Sciences\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.35377/saucis...1223054\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Sakarya University Journal of Computer and Information Sciences","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.35377/saucis...1223054","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

机器学习(ML)经常被用于在许多问题领域构建智能系统,包括网络安全。对于恶意网络活动检测,基于机器学习的入侵检测系统(ids)由于能够在学习过程后自主分类攻击而具有很大的应用前景。然而,由于目前文献中大量可用的方法,包括ML分类算法和预处理技术,这是一项具有挑战性的任务。为了分析预处理技术对ML算法的影响,本研究进行了广泛的实验,使用了支持向量机(SVM)、分类器和FS技术、几种归一化技术以及网格搜索分类器优化算法。这些方法依次在三个公开可用的网络入侵数据集NSL-KDD、UNSW-NB15和CICIDS2017上进行了测试。随后,对结果进行分析,以调查每个模型的影响,并提取构建智能高效IDS的见解。结果表明,数据预处理显著提高了分类性能,对数尺度规范化优于入侵检测数据集的其他技术。此外,结果表明,嵌入式SVM-FS是准确的,分类器优化可以提高分类器依赖的FS技术的性能。然而,分类器优化中的特征选择是一个必须解决的关键问题。总之,本研究通过揭示有关数据预处理的重要信息,为构建基于ml的NIDS提供了见解。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Preprocessing Impact Analysis for Machine Learning-Based Network Intrusion Detection
Machine learning (ML) has been frequently used to build intelligent systems in many problem domains, including cybersecurity. For malicious network activity detection, ML-based intrusion detection systems (IDSs) are promising due to their ability to classify attacks autonomously after learning process. However, this is a challenging task due to the vast number of available methods in the current literature, including ML classification algorithms and preprocessing techniques. For analysis the impact of preprocessing techniques on the ML algorithm, this study has conducted extensive experiments, using support vector machines (SVM), the classifier and the FS technique, several normalisation techniques, and a grid-search classifier optimisation algorithm. These methods were sequentially tested on three publicly available network intrusion datasets, NSL-KDD, UNSW-NB15, and CICIDS2017. Subsequently, the results were analysed to investigate the impact of each model and to extract the insights for building intelligent and efficient IDS. The results exhibited that data preprocessing significantly improves classification performance and log-scaling normalisation outperformed other techniques for intrusion detection datasets. Additionally, the results suggested that the embedded SVM-FS is accurate and classifier optimisation can improve performance of classifier-dependent FS techniques. However, feature selection in classifier optimisation is a critical problem that must be addressed. In conclusion, this study provides insights for building ML-based NIDS by revealing important information about data preprocessing.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信