Authors: José García, Jorge Entrena, Álvaro Alesanco
Journal: Internet of Things, volume 28, Article 101367 (Impact Factor 6.0; JCR Q1, Computer Science, Information Systems)
Publication date: 2024-09-07
DOI: 10.1016/j.iot.2024.101367
URL: https://www.sciencedirect.com/science/article/pii/S2542660524003081
Empirical evaluation of feature selection methods for machine learning based intrusion detection in IoT scenarios
This paper addresses the need for stronger security in the Internet of Things (IoT), where inherent device vulnerabilities leave systems exposed to many forms of cyber-attack, and emphasizes the role of Intrusion Detection Systems (IDS) in continuous threat monitoring. The objective of the study was to evaluate feature selection (FS) methods, combined with various machine learning (ML) techniques, for classifying traffic flows in datasets containing intrusions in IoT environments. An extensive benchmark of ML techniques and FS methods was performed, covering three FS approaches: Filter Feature Ranking (FFR), Filter-Feature Subset Selection (FSS), and Wrapper-based Feature Selection (WFS). FS is pivotal in handling large volumes of IoT data because it removes irrelevant attributes, mitigates the curse of dimensionality, improves model interpretability, and saves resources on devices with limited capacity. Key findings indicate that certain tree-based algorithms, such as J48 or PART, outperform other ML techniques (naive Bayes, multi-layer perceptron, logistic, adaptive boosting, or k-Nearest Neighbors) at classifying traffic flows, with a good balance between performance and execution time. The advantages and drawbacks of the FS methods are discussed, highlighting the main differences in the results obtained with each approach. FSS approaches such as CFS could be more suitable than FFR, which may select correlated attributes, or than WFS methods, which may tailor attribute subsets to a specific ML technique and have lengthy execution times. In any case, reducing attributes via FS allowed classification to be optimized without compromising accuracy. F1 scores above 0.99, together with a reduction of over 60% in the number of attributes, were achieved in most experiments across four datasets, in both binary and multiclass modes. This work emphasizes the importance of a balanced attribute selection process that takes into account both threat detection capability and computational complexity.
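The contrast the abstract draws between FFR (which "may select correlated attributes") and an FSS method such as CFS can be illustrated with a small sketch. This is not the authors' pipeline or tooling. It assumes absolute Pearson correlation as the relevance and redundancy measure (CFS implementations often use other measures, such as symmetrical uncertainty), uses a plain greedy forward search, and runs on a made-up eight-flow toy dataset whose feature names (`pkt_rate`, `byte_rate`, `duration`) are hypothetical. The merit function is the standard CFS heuristic, merit = k·r̄_cf / sqrt(k + k(k−1)·r̄_ff).

```python
import math
from itertools import combinations


def abs_pearson(x, y):
    """Absolute Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no information
    return abs(cov / math.sqrt(vx * vy))


def rank_features(columns, target):
    """FFR-style filter: score every feature on its own, best first."""
    scores = {name: abs_pearson(col, target) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


def cfs_merit(subset, columns, target):
    """CFS merit: rewards class correlation, penalizes inter-feature correlation."""
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(abs_pearson(columns[f], target) for f in subset) / k
    pairs = list(combinations(subset, 2))
    r_ff = (
        sum(abs_pearson(columns[a], columns[b]) for a, b in pairs) / len(pairs)
        if pairs
        else 0.0
    )
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def cfs_forward(columns, target):
    """Greedy forward search: add the feature that most improves the merit."""
    selected, best = [], 0.0
    remaining = set(columns)
    while remaining:
        candidate, merit = max(
            ((f, cfs_merit(selected + [f], columns, target)) for f in sorted(remaining)),
            key=lambda pair: pair[1],
        )
        if merit <= best:
            break  # no remaining feature improves the subset
        selected.append(candidate)
        remaining.remove(candidate)
        best = merit
    return selected, best


# Hypothetical toy data: 8 flows, label 1 = intrusion.
# byte_rate is nearly a rescaled copy of pkt_rate; duration is uncorrelated with pkt_rate.
target = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "pkt_rate": [1, 1, 3, 3, 3, 3, 5, 5],
    "byte_rate": [1, 3, 5, 7, 5, 7, 9, 11],
    "duration": [3, 5, 1, 3, 5, 7, 3, 5],
}

print("FFR top-2:", rank_features(columns, target)[:2])
print("CFS subset:", cfs_forward(columns, target))
```

On this toy data the ranking keeps the two redundant rate features, while the subset search pairs `pkt_rate` with the less individually relevant but complementary `duration`. A wrapper method would instead score each candidate subset by training the target classifier, which is why the abstract notes its longer execution time and classifier-specific subsets.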
About the journal:
Internet of Things; Engineering Cyber Physical Human Systems is a comprehensive journal encouraging cross collaboration between researchers, engineers and practitioners in the field of IoT & Cyber Physical Human Systems. The journal offers a unique platform to exchange scientific information on the entire breadth of technology, science, and societal applications of the IoT.
The journal will place a high priority on timely publication and provide a home for high-quality research.
Furthermore, IOT is interested in publishing topical Special Issues on any aspect of IOT.