DTKD-IDS: A dual-teacher knowledge distillation intrusion detection model for the industrial internet of things
Biao Xie, Zhendong Wang, Zhiyuan Zeng, Daojing He, Sammy Chan
Ad Hoc Networks, Volume 174, Article 103869. Published 2025-04-11. DOI: 10.1016/j.adhoc.2025.103869
https://www.sciencedirect.com/science/article/pii/S1570870525001179
Citations: 0
Abstract
While advances in technology have brought great opportunities for the development of the Industrial Internet of Things (IIoT), cybersecurity risks are also increasing. Intrusion detection is a key technology for ensuring the security and smooth operation of the Internet of Things (IoT), but owing to the resource constraints of IIoT devices, intrusion detection solutions need to be targeted and customized for the IIoT. This paper proposes a dual-teacher knowledge distillation intrusion detection model, DTKD-IDS, which improves anomaly detection performance, accelerates detection, and reduces model complexity. Specifically, to make the distillation process more efficient and stable, DTKD-IDS outputs a data prototype vector after each convolutional layer of the student network and the first teacher network. On the basis of these two prototype vectors, the student model extracts the most valuable knowledge from the structurally similar first teacher model; we name this process prototype distillation. In addition, we weight the extracted knowledge according to the final classification losses of the two teacher networks and adaptively adjust these weights during training, so that more accurate output distributions guide the student network; we refer to this process as complementary distillation. During the training phase, we design a stable loss function to improve training efficiency. Through knowledge distillation, the model size and the number of parameters are reduced by factors of about 250 and 20, respectively, compared with the first teacher model, and by about 30 and 4 compared with the second teacher model, while high detection performance is maintained. Extensive experiments on the X-IIoTID, NSL-KDD and CICDDoS2019 datasets show that DTKD-IDS improves on the performance of traditional deep learning methods and recent state-of-the-art models.
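The abstract describes two mechanisms: prototype distillation (matching per-layer feature prototypes between the student and the structurally similar first teacher) and complementary distillation (weighting the two teachers' soft targets adaptively by their classification losses). The sketch below is a minimal, illustrative combination of such losses in PyTorch, not the authors' implementation; the channel-wise pooling used to form prototypes, the softmax weighting rule, the temperature T, and the coefficients alpha and beta are assumptions made here for clarity.

```python
# Minimal sketch (not the paper's code): dual-teacher distillation with
# prototype matching and adaptive (complementary) teacher weighting.
import torch
import torch.nn.functional as F

def prototype_loss(student_feats, teacher_feats):
    """Match per-layer prototypes (here: channel-wise means of conv feature
    maps) between the student and the structurally similar first teacher.
    The pooling choice is an assumption for illustration."""
    loss = 0.0
    for fs, ft in zip(student_feats, teacher_feats):
        ps = fs.mean(dim=(2, 3))           # [B, C] prototype from student layer
        pt = ft.mean(dim=(2, 3)).detach()  # [B, C] prototype from teacher layer
        loss = loss + F.mse_loss(ps, pt)
    return loss

def complementary_kd_loss(student_logits, t1_logits, t2_logits, labels, T=4.0):
    """Weight each (frozen) teacher's soft targets by how well it classifies
    the batch: lower cross-entropy -> larger weight. The softmax weighting
    rule is an assumption; the paper adapts weights from the teachers'
    final classification losses."""
    ce1 = F.cross_entropy(t1_logits.detach(), labels)
    ce2 = F.cross_entropy(t2_logits.detach(), labels)
    w = torch.softmax(torch.stack([-ce1, -ce2]), dim=0)  # adaptive weights
    log_p_s = F.log_softmax(student_logits / T, dim=1)
    kd1 = F.kl_div(log_p_s, F.softmax(t1_logits.detach() / T, dim=1),
                   reduction="batchmean") * T * T
    kd2 = F.kl_div(log_p_s, F.softmax(t2_logits.detach() / T, dim=1),
                   reduction="batchmean") * T * T
    return w[0] * kd1 + w[1] * kd2

def total_loss(student_logits, student_feats, t1_logits, t1_feats,
               t2_logits, labels, alpha=0.5, beta=0.1):
    """Combine hard-label cross-entropy, complementary KD, and prototype
    distillation. alpha and beta are placeholder coefficients, not values
    taken from the paper."""
    ce = F.cross_entropy(student_logits, labels)
    kd = complementary_kd_loss(student_logits, t1_logits, t2_logits, labels)
    proto = prototype_loss(student_feats, t1_feats)
    return ce + alpha * kd + beta * proto
```

Under these assumptions, a training step would run the student and both (frozen) teachers on a batch, collect the per-layer feature maps of the student and the first teacher, and backpropagate only through the student via `total_loss`.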
Journal introduction:
Ad Hoc Networks is an international archival journal providing complete coverage of all topics of interest to those involved in ad hoc and sensor networking. The journal considers original, high-quality, unpublished contributions addressing all aspects of ad hoc and sensor networks. Specific areas of interest include, but are not limited to:
Mobile and Wireless Ad Hoc Networks
Sensor Networks
Wireless Local and Personal Area Networks
Home Networks
Ad Hoc Networks of Autonomous Intelligent Systems
Novel Architectures for Ad Hoc and Sensor Networks
Self-organizing Network Architectures and Protocols
Transport Layer Protocols
Routing protocols (unicast, multicast, geocast, etc.)
Media Access Control Techniques
Error Control Schemes
Power-Aware, Low-Power and Energy-Efficient Designs
Synchronization and Scheduling Issues
Mobility Management
Mobility-Tolerant Communication Protocols
Location Tracking and Location-based Services
Resource and Information Management
Security and Fault-Tolerance Issues
Hardware and Software Platforms, Systems, and Testbeds
Experimental and Prototype Results
Quality-of-Service Issues
Cross-Layer Interactions
Scalability Issues
Performance Analysis and Simulation of Protocols.