Low-latency and interpretable intrusion detection for IIoT using self-supervised learning with entropy-based masking

IF 4.9 · Region 3 (Computer Science) · JCR Q1 (Computer Science, Hardware & Architecture)
Fasih Ullah Khan, Adnan Noor Mian
Computers & Electrical Engineering, Volume 128, Article 110753
Published: 2025-10-10
Publication type: Journal Article
DOI: 10.1016/j.compeleceng.2025.110753
URL: https://www.sciencedirect.com/science/article/pii/S0045790625006962
Citations: 0

Abstract

The Industrial Internet of Things (IIoT) has enhanced data connectivity across domains such as smart cities and industry, but this advancement has also created security risks that call for robust countermeasures. One critical challenge in developing effective intrusion detection systems (IDS) for IIoT is class imbalance in training datasets. In most cases benign traffic predominates, which biases model training and degrades the detection of rare attacks. To address these issues and detect both normal traffic and various attack categories, even with label scarcity and class imbalance, we propose a low-latency gradient boosting framework for efficient intrusion detection. Our approach uses self-supervised learning (SSL) to improve efficiency and robustness. This hybrid approach employs a Masked Autoencoder (MAE) to extract robust representations from unlabeled data, followed by classification using LightGBM. To enhance the learning capability of the proposed framework, we integrate an entropy-based masking strategy into the MAE, so that features with high uncertainty are masked with high probability during training. This targeted feature selection forces the model to reconstruct the most informative features. As a result, the model’s robustness improves and it can capture strong feature dependencies, even with imbalanced and label-scarce data. We validate the model’s effectiveness on three publicly available datasets, namely BoT-IoT, ToN-IoT, and WUSTL-IIoT. The proposed framework improves inference time by a factor of 104 over state-of-the-art (SOTA) methods. It also achieves precision, recall, and F1-score of 99%, 93%, and 95%, respectively, which are comparable to existing SOTA methods.
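The abstract gives no implementation details for the entropy-based masking, so the following is only a minimal sketch of the general idea that higher-entropy features are masked with higher probability. The histogram entropy estimate, the proportional scaling rule, and the names `column_entropy`, `masking_probabilities`, `sample_mask`, and `mean_ratio` are all assumptions made for illustration, not the authors' method.

```python
import math
import random
from collections import Counter

def column_entropy(values, bins=16):
    """Shannon entropy (bits) of one feature, from an equal-width histogram."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant feature carries no uncertainty
    width = (hi - lo) / bins
    # Clamp the maximum value into the last bin.
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def masking_probabilities(entropies, mean_ratio=0.3):
    """Assumed rule: probability proportional to entropy, averaging about mean_ratio."""
    total = sum(entropies)
    if total == 0:
        return [mean_ratio] * len(entropies)
    k = mean_ratio * len(entropies) / total
    return [min(1.0, k * h) for h in entropies]

def sample_mask(probs, n_rows, rng):
    """Row-by-feature boolean mask; True marks a cell hidden from the encoder."""
    return [[rng.random() < p for p in probs] for _ in range(n_rows)]

rng = random.Random(0)
n = 1000
columns = [
    [rng.gauss(0.0, 1.0) for _ in range(n)],       # spread-out feature: high entropy
    [0.0] * n,                                     # constant feature: zero entropy
    [float(rng.randint(0, 1)) for _ in range(n)],  # binary feature: about 1 bit
]
entropies = [column_entropy(col) for col in columns]
probs = masking_probabilities(entropies)
mask = sample_mask(probs, n, rng)
```

In this toy setup the Gaussian feature receives the largest masking probability and the constant feature is never masked. In a full pipeline of the kind the abstract describes, the masked rows would be fed to the autoencoder for reconstruction, and the learned representations would then be passed to a LightGBM classifier.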
Source journal
Computers & Electrical Engineering (Engineering & Technology: Engineering, Electrical & Electronic)
CiteScore: 9.20
Self-citation rate: 7.00%
Articles published: 661
Review time: 47 days
About the journal: The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation that has been motivated by the natural ease of interface between computers and electrical systems, and the promise of spectacular improvements in speed and efficiency. Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas like signal and image processing, high-performance computing, parallel processing, and communications. Special attention will be paid to papers describing innovative architectures, algorithms, and software tools.