A comparative study of machine learning and deep learning models in binary and multiclass classification for intrusion detection systems
Ayesha Alharthi, Meera Alaryani, Sanaa Kaddoura
Array, vol. 26, Article 100406. Published 2025-05-08. DOI: 10.1016/j.array.2025.100406
https://www.sciencedirect.com/science/article/pii/S2590005625000335
The evolution of network infrastructure has significantly expanded the attack surface, leading to increasingly complex and sophisticated cybersecurity threats. Traditional rule-based intrusion detection systems (IDS) often fail to detect emerging attack vectors, prompting the need for intelligent, data-driven approaches. This study evaluates and compares the performance of machine learning (ML) and deep learning (DL) models for network intrusion detection. Two publicly available datasets were used: a binary-labeled software-defined networking (SDN) dataset and a multiclass industrial control system dataset based on the IEC 60870-5-104 protocol. Preprocessing included normalization, label encoding, and a 70:10:20 train-validation-test split. Seven models (Random Forest, Decision Tree, K-Nearest Neighbors, XGBoost, Convolutional Neural Network, Gated Recurrent Unit, and Long Short-Term Memory) were trained and evaluated using precision, recall, and F1-score. Random Forest achieved the highest F1-score on the IEC 60870-5-104 dataset (93.57%), while XGBoost attained a near-perfect F1-score of 99.97% on the SDN dataset. These results exceed those reported for comparable models in the literature and offer practical guidance for selecting an IDS model according to classification type (binary or multiclass) and dataset structure.
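The preprocessing and evaluation steps named in the abstract can be sketched in plain Python. This is only an illustrative sketch, not the authors' pipeline: the abstract does not say which normalization or which F1 averaging scheme was used, so min-max scaling and macro-averaging are assumptions here, and all function names (`min_max_normalize`, `label_encode`, `split_70_10_20`, `precision_recall_f1`, `macro_f1`) are hypothetical.

```python
import random


def min_max_normalize(column):
    """Scale a numeric column to [0, 1] (assumed normalization scheme)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def label_encode(labels):
    """Map each distinct label to an integer, in sorted label order."""
    mapping = {label: i for i, label in enumerate(sorted(set(labels)))}
    return [mapping[label] for label in labels], mapping


def split_70_10_20(n_samples, seed=0):
    """Shuffle sample indices and cut them 70:10:20 into train/val/test."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = n_samples * 70 // 100
    n_val = n_samples * 10 // 100
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remainder, roughly 20%
    return train, val, test


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = [precision_recall_f1(y_true, y_pred, c)[2] for c in classes]
    return sum(scores) / len(scores)
```

In the binary SDN setting, `precision_recall_f1` with the attack class as `positive` would be the natural call, whereas the multiclass IEC 60870-5-104 setting needs some form of averaging such as `macro_f1`. In practice a library such as scikit-learn would normally replace these hand-written helpers.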