A hybrid approach for efficient feature selection in anomaly intrusion detection for IoT networks

The Journal of Supercomputing, Pub Date: 2024-08-29, DOI: 10.1007/s11227-024-06409-x

Aya G. Ayad, Nehal A. Sakr, Noha A. Hikal



Abstract

The exponential growth of Internet of Things (IoT) devices underscores the need for robust security measures against cyber-attacks. Extensive research in the IoT security community has centered on effective traffic detection models, with a particular focus on anomaly intrusion detection systems (AIDS). This paper specifically addresses the preprocessing stage for IoT datasets and feature selection approaches to reduce the complexity of the data. The goal is to develop an efficient AIDS that strikes a balance between high accuracy and low detection time. To achieve this goal, we propose a hybrid feature selection approach that combines filter and wrapper methods. This approach is integrated into a two-level anomaly intrusion detection system. At level 1, our approach classifies network packets into normal or attack, with level 2 further classifying the attack to determine its specific category. One critical aspect we consider is the imbalance in these datasets, which is addressed using the Synthetic Minority Over-sampling Technique (SMOTE). To evaluate how the selected features affect the performance of the machine learning model across different algorithms, namely Decision Tree, Random Forest, Gaussian Naive Bayes, and k-Nearest Neighbor, we employ benchmark datasets: BoT-IoT, TON-IoT, and CIC-DDoS2019. Evaluation metrics encompass detection accuracy, precision, recall, and F1-score. Results indicate that the decision tree achieves high detection accuracy, ranging between 99.82 and 100%, with short detection times ranging between 0.02 and 0.15 s, outperforming existing AIDS architectures for IoT networks and establishing its superiority in achieving both accuracy and efficient detection times.
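The filter-then-wrapper idea described in the abstract can be illustrated with a minimal sketch. The abstract does not name the specific filter score, wrapper search, or data layout, so everything below is an assumption made for illustration: absolute Pearson correlation stands in for the filter, greedy forward selection with a nearest-centroid classifier stands in for the wrapper, and the data are synthetic. SMOTE balancing and the second (attack-category) level are left out to keep the example short.

```python
import random
from statistics import mean, pstdev

def pearson_abs(xs, ys):
    """Absolute Pearson correlation; 0.0 if either side is constant."""
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(xs), mean(ys)
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return abs(cov / (sx * sy))

def filter_stage(X, y, keep):
    """Filter: rank features by |correlation with the label|, keep the top `keep`."""
    n_features = len(X[0])
    scores = [pearson_abs([row[j] for row in X], y) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:keep]

def centroid_accuracy(X_tr, y_tr, X_te, y_te, feats):
    """Accuracy of a nearest-centroid classifier restricted to `feats`."""
    centroids = {}
    for label in set(y_tr):
        rows = [x for x, t in zip(X_tr, y_tr) if t == label]
        centroids[label] = [mean(r[j] for r in rows) for j in feats]

    def predict(x):
        return min(
            centroids,
            key=lambda c: sum((x[j] - m) ** 2 for j, m in zip(feats, centroids[c])),
        )

    return mean(1.0 if predict(x) == t else 0.0 for x, t in zip(X_te, y_te))

def wrapper_stage(X_tr, y_tr, X_val, y_val, candidates):
    """Wrapper: greedy forward selection driven by validation accuracy."""
    selected, best = [], 0.0
    improved = True
    while improved:
        improved, pick = False, None
        for j in candidates:
            if j in selected:
                continue
            score = centroid_accuracy(X_tr, y_tr, X_val, y_val, selected + [j])
            if score > best:  # keep only strict improvements
                best, pick, improved = score, j, True
        if improved:
            selected.append(pick)
    return selected, best

# Synthetic traffic records (hypothetical): label 0 = normal, 1 = attack.
# Feature 0 separates the classes; features 1..5 are pure noise.
rng = random.Random(0)
y = [i % 2 for i in range(200)]
X = [[6.0 * label + rng.gauss(0, 1)] + [rng.gauss(0, 1) for _ in range(5)]
     for label in y]
X_tr, y_tr, X_val, y_val = X[:140], y[:140], X[140:], y[140:]

candidates = filter_stage(X_tr, y_tr, keep=3)                   # cheap pre-screen
selected, acc = wrapper_stage(X_tr, y_tr, X_val, y_val, candidates)
print("filter candidates:", candidates)
print("wrapper selection:", selected, "validation accuracy:", round(acc, 3))
```

The filter stage is cheap and shrinks the search space, so the more expensive wrapper stage only has to evaluate a handful of candidate subsets; this is one plausible reading of how a hybrid approach could trade a small amount of accuracy for shorter detection time.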
