Milan Samantaray, Ram Chandra Barik, Anil Kumar Biswal
Decision Analytics Journal, Volume 11, Article 100478. Published 2024-05-15. DOI: 10.1016/j.dajour.2024.100478. https://www.sciencedirect.com/science/article/pii/S2772662224000821
Citations: 0
A comparative assessment of machine learning algorithms in the IoT-based network intrusion detection systems
Abstract
The rapid increase in online risks reflects the exponential growth of Internet of Things (IoT) networks. Researchers have proposed numerous intrusion detection techniques to mitigate the harm caused by these threats. Enterprises use intrusion detection systems (IDSs) and intrusion prevention systems (IPSs) to keep their networks safe, stable, and accessible. Network intrusion detection solutions have recently integrated powerful Machine Learning (ML) techniques to safeguard IoT networks. Selecting the proper data features for training such ML models is critical to maximizing detection accuracy and computational efficiency. However, the efficiency of these systems degrades in high-dimensional data spaces, so a suitable feature extraction method is needed to eliminate extraneous data from the classification procedure. The detection accuracy and false positive rate of many ML-based IDSs also rise when the samples used to train the models are unbalanced. This study provides a detailed overview of the UNSW-NB15 (DS-1) and NF-UNSW-NB15 (DS-2) intrusion detection datasets, which are used to develop and evaluate the models. In addition, the proposed model uses the MaxAbsScaler algorithm to implement a filter-based feature scaling strategy. The condensed feature set is then used to train several ML classifiers for multiclass classification: Support Vector Machines (SVM), K-Nearest Neighbors (KNN), Logistic Regression (LR), Naive Bayes (NB), Decision Tree (DT), and Random Forest (RF). With the MaxAbsScaler-based feature scaling method, multiclass classification accuracy improved from 60% to 94%.
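The scaling step named in the abstract can be illustrated with a minimal sketch. The name MaxAbsScaler matches scikit-learn's `sklearn.preprocessing.MaxAbsScaler`, which divides each feature by its maximum absolute value; the code below reimplements that arithmetic in plain Python so it runs without dependencies. It is not the authors' pipeline: the function names (`fit_max_abs`, `transform_max_abs`, `predict_1nn`), the two toy features, and every data value are assumptions invented for illustration, and a 1-nearest-neighbor rule stands in for the KNN classifier.

```python
from math import dist


def fit_max_abs(rows):
    """Return the per-feature maximum absolute value seen in the training rows."""
    n_features = len(rows[0])
    scales = [max(abs(row[j]) for row in rows) for j in range(n_features)]
    # A feature that is zero everywhere would cause division by zero; leave it unscaled.
    return [s if s != 0 else 1.0 for s in scales]


def transform_max_abs(rows, scales):
    """Divide every feature by its training-set maximum absolute value."""
    return [[value / s for value, s in zip(row, scales)] for row in rows]


def predict_1nn(train_rows, train_labels, query):
    """Label the query with the class of its nearest training row (Euclidean distance)."""
    nearest = min(range(len(train_rows)), key=lambda i: dist(train_rows[i], query))
    return train_labels[nearest]


# Hypothetical toy flows: [bytes transferred, connection rate]. Values and labels are invented.
train_rows = [
    [1000.0, 0.1],
    [1200.0, 0.2],
    [1100.0, 0.9],
    [900.0, 0.8],
]
train_labels = ["normal", "normal", "dos", "dos"]
query = [1180.0, 0.85]

# Without scaling, the byte count dominates the distance computation.
raw_prediction = predict_1nn(train_rows, train_labels, query)

# Fit the scale factors on training data only, then apply them to both sets.
scales = fit_max_abs(train_rows)
scaled_train = transform_max_abs(train_rows, scales)
scaled_query = transform_max_abs([query], scales)[0]
scaled_prediction = predict_1nn(scaled_train, train_labels, scaled_query)

print("scales:", scales)
print("raw features    ->", raw_prediction)
print("scaled features ->", scaled_prediction)
```

On this toy data the unscaled query is labeled `normal` because the large-magnitude byte count swamps the connection rate, while the scaled query is labeled `dos` once both features lie in [-1, 1]. This only shows why distance-based classifiers are sensitive to feature scale; it says nothing about the accuracy figures reported in the abstract.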