Low-latency and interpretable intrusion detection for IIoT using self-supervised learning with entropy-based masking
Authors: Fasih Ullah Khan, Adnan Noor Mian
Journal: Computers & Electrical Engineering, volume 128, Article 110753 (Impact Factor 4.9; JCR Q1, Computer Science, Hardware & Architecture)
Published: 2025-10-10
DOI: 10.1016/j.compeleceng.2025.110753
URL: https://www.sciencedirect.com/science/article/pii/S0045790625006962
The Industrial Internet of Things (IIoT) has enhanced data connectivity across domains such as smart cities and industry. However, this advancement has also created security risks that necessitate robust security measures. One critical challenge in developing effective intrusion detection systems (IDS) for IIoT is class imbalance in training datasets. In most cases, benign traffic predominates, leading to biased model training and underperformance in detecting rare attacks. To address these issues and detect both normal traffic and various attack categories, even under label scarcity and class imbalance, we propose a low-latency gradient boosting framework for intrusion detection. Our approach uses self-supervised learning (SSL) to improve efficiency and robustness. This hybrid approach employs a Masked Autoencoder (MAE) to extract robust representations from unlabeled data, followed by classification using LightGBM. To enhance the learning capability of the proposed framework, we integrate an entropy-based masking strategy into the MAE, so that features with high uncertainty are masked with high probability during training. This targeted selection of features trains the model to reconstruct the most informative ones. As a result, the model's robustness improves and it can capture strong feature dependencies, even in the presence of imbalanced and label-scarce data. We validate the model's effectiveness on three publicly available datasets: BoT-IoT, ToN-IoT, and WUSTL-IIoT. The proposed framework improves inference time by a factor of 104 over state-of-the-art (SOTA) methods. It also achieves precision, recall, and F1-score of 99%, 93%, and 95%, respectively, which are comparable to existing SOTA methods.
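The abstract does not state the exact masking rule, so the following is only a minimal sketch of one way entropy-based masking could work, not the authors' implementation. It assumes a histogram estimate of each feature's Shannon entropy and masking probabilities proportional to that entropy. The function names, the 16-bin histogram, the `mask_ratio` of 0.3, and the zero fill value for masked entries are all hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter


def column_entropy(values, n_bins=16):
    """Shannon entropy (in bits) of one feature, from an equal-width histogram."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant feature carries no uncertainty
    width = (hi - lo) / n_bins
    # Clamp the index so the maximum value falls into the last bin.
    bins = Counter(min(int((v - lo) / width), n_bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in bins.values())


def masking_probabilities(entropies, mask_ratio=0.3):
    """Assumed rule: probability proportional to entropy, scaled so the
    average is mask_ratio before capping each probability at 1.0."""
    total = sum(entropies)
    if total == 0:
        return [mask_ratio] * len(entropies)
    scale = mask_ratio * len(entropies) / total
    return [min(1.0, scale * h) for h in entropies]


def sample_mask(probs, n_rows, rng):
    """Independent Bernoulli mask: True means the entry is hidden."""
    return [[rng.random() < p for p in probs] for _ in range(n_rows)]


# Toy data: a constant feature, a binary feature, and a uniform feature.
rng = random.Random(0)
rows = [[0.0, float(rng.randint(0, 1)), rng.random()] for _ in range(1000)]
columns = list(zip(*rows))

entropies = [column_entropy(col) for col in columns]
probs = masking_probabilities(entropies)
mask = sample_mask(probs, len(rows), rng)

# Masked entries are replaced by 0.0 (an assumed placeholder value); an
# autoencoder would then be trained to reconstruct the original rows.
masked_rows = [
    [0.0 if hidden else value for value, hidden in zip(row, mask_row)]
    for row, mask_row in zip(rows, mask)
]
```

In this toy setup the constant feature is never masked, and the uniform feature is masked more often than the binary one. The learned encoder output would then be passed to a gradient boosting classifier such as LightGBM; that stage is omitted here.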
About the journal:
The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation that has been motivated by the natural ease of interface between computers and electrical systems, and the promise of spectacular improvements in speed and efficiency.
Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas like signal and image processing, high-performance computing, parallel processing, and communications. Special attention will be paid to papers describing innovative architectures, algorithms, and software tools.