Feature Engineering in Machine Learning-Based Intrusion Detection Systems for OT Networks
Alex Howe, M. Papa
2023 IEEE International Conference on Smart Computing (SMARTCOMP), 2023-06-01
DOI: 10.1109/SMARTCOMP58114.2023.00086
https://doi.org/10.1109/SMARTCOMP58114.2023.00086
Citations: 2
Abstract
This paper evaluates the importance of feature exploration and engineering when applying machine learning to intrusion detection in Operational Technology (OT) networks. The data consisted of raw network traffic captures from a simulated OT environment communicating over the Modbus/TCP protocol. Feature engineering identified thirty-eight attributes of interest at different layers of the network stack. The Random Forest algorithm was used to analyze the importance of each feature for detecting anomalous network behavior. Both supervised and unsupervised learning methods were evaluated, including Random Forest, Support Vector Machines, K-Nearest Neighbors, K-Means Clustering, and Isolation Forest. Results indicate that statistics-based features, as well as features derived from the protocol and application layers, contained the information best suited to detecting anomalous OT behavior. Additionally, variable importance-based feature selection reduced complexity and improved the detection rate compared with models trained on the original high-dimensional data. Random Forest and Support Vector Machines had the best detection performance but required a large amount of labeled data for training and validation. Notably, Isolation Forest shows potential for anomaly detection in OT networks, as it requires no labeled data and produced promising results.
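To make the two feature families named in the abstract concrete, the following is a minimal sketch of how application-layer and statistics-based features might be derived from Modbus/TCP traffic. It is not the authors' pipeline, and it does not reproduce their thirty-eight attributes. The function names `modbus_features` and `window_statistics`, the chosen fields, and the windowing scheme are all assumptions made for illustration. Only the header layout (transaction ID, protocol ID, length, unit ID, then the function code) comes from the Modbus/TCP specification.

```python
import struct
from statistics import mean, pstdev

# Modbus/TCP MBAP header: transaction id, protocol id, length (big-endian
# 16-bit each), unit id (8-bit), followed by the 1-byte function code.
MBAP_PLUS_FC = struct.Struct(">HHHBB")


def modbus_features(payload: bytes) -> dict:
    """Extract application-layer fields from one Modbus/TCP payload.

    Hypothetical helper: the field selection is illustrative only.
    """
    if len(payload) < MBAP_PLUS_FC.size:
        raise ValueError("payload too short for a Modbus/TCP header")
    tid, pid, length, unit, fc = MBAP_PLUS_FC.unpack_from(payload)
    return {
        "transaction_id": tid,
        "protocol_id": pid,               # 0 identifies Modbus
        "length": length,                 # bytes following the length field
        "unit_id": unit,
        "function_code": fc & 0x7F,       # low 7 bits identify the function
        "is_exception": bool(fc & 0x80),  # high bit marks an exception response
    }


def window_statistics(timestamps: list[float], sizes: list[int]) -> dict:
    """Summarize a non-empty window of packets with simple statistics.

    Hypothetical helper: the window definition is left to the caller.
    """
    # Inter-arrival gaps between consecutive packets in the window.
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return {
        "packet_count": len(sizes),
        "mean_size": mean(sizes),
        "std_size": pstdev(sizes),
        "mean_interarrival": mean(gaps) if gaps else 0.0,
        "std_interarrival": pstdev(gaps) if gaps else 0.0,
    }


if __name__ == "__main__":
    # Read Holding Registers request: unit 1, start address 0, quantity 10.
    request = bytes.fromhex("00010000000601030000000a")
    print(modbus_features(request))
    print(window_statistics([0.0, 1.0, 2.0], [10, 20, 30]))
```

Rows built this way could then be passed to any of the evaluated models. Which fields carry signal is the question the paper addresses with Random Forest importance scores, so the selection above should be read as a starting point, not a recommendation.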