DTKD-IDS: A dual-teacher knowledge distillation intrusion detection model for the industrial internet of things
Biao Xie, Zhendong Wang, Zhiyuan Zeng, Daojing He, Sammy Chan
Ad Hoc Networks, Volume 174, Article 103869. Published 2025-04-11. DOI: 10.1016/j.adhoc.2025.103869
https://www.sciencedirect.com/science/article/pii/S1570870525001179
Citations: 0
Abstract
While advances in technology have brought great opportunities for the development of the Industrial Internet of Things (IIoT), cybersecurity risks are also increasing. Intrusion detection is a key technology for ensuring the security and smooth operation of the Internet of Things (IoT), but owing to the resource constraints of IIoT devices, intrusion detection solutions need to be targeted and customized for the IIoT. This paper proposes a dual-teacher knowledge distillation intrusion detection model, DTKD-IDS, which improves anomaly detection performance, accelerates detection, and reduces model complexity. Specifically, to make the distillation process more efficient and stable, DTKD-IDS outputs a data prototype vector after each convolutional layer of the student network and the first teacher network. On the basis of these two prototype vectors, the student model extracts the most valuable knowledge from the structurally similar first teacher model; we name this process prototype distillation. In addition, we weight the extracted knowledge according to the final classification losses of the two teacher networks and adaptively adjust these weights during training, so that more accurate output distributions guide the student network; we refer to this process as complementary distillation. During the training phase, we design a stable loss function to improve training efficiency. Through knowledge distillation, the model size and the number of parameters are reduced by factors of about 250 and 20, respectively, compared with the first teacher model, and by about 30 and 4 compared with the second teacher model, while high detection performance is maintained. Extensive experiments on the X-IIoTID, NSL-KDD and CICDDoS2019 datasets show that DTKD-IDS improves on the performance of traditional deep learning methods and recent state-of-the-art models.
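The abstract describes two mechanisms: prototype distillation (matching per-layer feature prototypes between the student and the structurally similar first teacher) and complementary distillation (weighting the two teachers' soft targets adaptively by their classification losses). The sketch below is a minimal, illustrative combination of such losses in PyTorch, not the authors' implementation; the channel-wise pooling used to form prototypes, the softmax weighting rule, the temperature T, and the coefficients alpha and beta are assumptions made here for clarity.

```python
# Minimal sketch (not the paper's code): dual-teacher distillation with
# prototype matching and adaptive (complementary) teacher weighting.
import torch
import torch.nn.functional as F

def prototype_loss(student_feats, teacher_feats):
    """Match per-layer prototypes (here: channel-wise means of conv feature
    maps) between the student and the structurally similar first teacher.
    The pooling choice is an assumption for illustration."""
    loss = 0.0
    for fs, ft in zip(student_feats, teacher_feats):
        ps = fs.mean(dim=(2, 3))           # [B, C] prototype from student layer
        pt = ft.mean(dim=(2, 3)).detach()  # [B, C] prototype from teacher layer
        loss = loss + F.mse_loss(ps, pt)
    return loss

def complementary_kd_loss(student_logits, t1_logits, t2_logits, labels, T=4.0):
    """Weight each (frozen) teacher's soft targets by how well it classifies
    the batch: lower cross-entropy -> larger weight. The softmax weighting
    rule is an assumption; the paper adapts weights from the teachers'
    final classification losses."""
    ce1 = F.cross_entropy(t1_logits.detach(), labels)
    ce2 = F.cross_entropy(t2_logits.detach(), labels)
    w = torch.softmax(torch.stack([-ce1, -ce2]), dim=0)  # adaptive weights
    log_p_s = F.log_softmax(student_logits / T, dim=1)
    kd1 = F.kl_div(log_p_s, F.softmax(t1_logits.detach() / T, dim=1),
                   reduction="batchmean") * T * T
    kd2 = F.kl_div(log_p_s, F.softmax(t2_logits.detach() / T, dim=1),
                   reduction="batchmean") * T * T
    return w[0] * kd1 + w[1] * kd2

def total_loss(student_logits, student_feats, t1_logits, t1_feats,
               t2_logits, labels, alpha=0.5, beta=0.1):
    """Combine hard-label cross-entropy, complementary KD, and prototype
    distillation. alpha and beta are placeholder coefficients, not values
    taken from the paper."""
    ce = F.cross_entropy(student_logits, labels)
    kd = complementary_kd_loss(student_logits, t1_logits, t2_logits, labels)
    proto = prototype_loss(student_feats, t1_feats)
    return ce + alpha * kd + beta * proto
```

Under these assumptions, a training step would run the student and both (frozen) teachers on a batch, collect the per-layer feature maps of the student and the first teacher, and backpropagate only through the student via `total_loss`.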
Journal introduction:
Ad Hoc Networks is an international archival journal providing complete coverage of all topics of interest to those involved in ad hoc and sensor networking. The journal considers original, high-quality, unpublished contributions addressing all aspects of ad hoc and sensor networks. Specific areas of interest include, but are not limited to:
Mobile and Wireless Ad Hoc Networks
Sensor Networks
Wireless Local and Personal Area Networks
Home Networks
Ad Hoc Networks of Autonomous Intelligent Systems
Novel Architectures for Ad Hoc and Sensor Networks
Self-organizing Network Architectures and Protocols
Transport Layer Protocols
Routing protocols (unicast, multicast, geocast, etc.)
Media Access Control Techniques
Error Control Schemes
Power-Aware, Low-Power and Energy-Efficient Designs
Synchronization and Scheduling Issues
Mobility Management
Mobility-Tolerant Communication Protocols
Location Tracking and Location-based Services
Resource and Information Management
Security and Fault-Tolerance Issues
Hardware and Software Platforms, Systems, and Testbeds
Experimental and Prototype Results
Quality-of-Service Issues
Cross-Layer Interactions
Scalability Issues
Performance Analysis and Simulation of Protocols.