Nguyen Huu Quyen, Phan The Duy, Ngo Thao Nguyen, Nghi Hoang Khoa, Van-Hau Pham
Information Fusion, vol. 117, Article 102807. Published 2024-11-26. DOI: 10.1016/j.inffus.2024.102807
https://www.sciencedirect.com/science/article/pii/S1566253524005852
FedKD-IDS: A robust intrusion detection system using knowledge distillation-based semi-supervised federated learning and anti-poisoning attack mechanism
In the Internet of Things (IoT), Intrusion Detection Systems (IDS) that leverage machine learning (ML) have grown notably in number and efficacy. Federated Learning-based IDSs (FL-based IDSs) in particular have seen significant growth. These systems aim to mitigate data privacy breaches and to minimize the communication overhead associated with dataset collection. Despite these advances, several challenges persist. Limited hardware resources prevent many IoT devices from actively participating in FL, and elevated communication overhead, the potential for recovering private data, non-independent and identically distributed (Non-IID) data, and a scarcity of labeled data remain noteworthy concerns. Additionally, vulnerabilities in server-client communication during the FL process make it relatively easy for attackers to execute poisoning attacks on the client side. To address these challenges, our paper introduces a semi-supervised approach for FL-based IDS. Our approach, named FedKD-IDS, employs knowledge distillation with a voting mechanism in place of weighted parameter aggregation and incorporates an anti-poisoning method. We conducted experiments to evaluate its effectiveness across diverse scenarios, including Non-IID and varying data distributions. We also investigated various proportions of malicious collaborators to demonstrate their impact on the federated training process. Results on the real-world N-BaIoT dataset indicate that our approach outperforms the state-of-the-art (SOTA) SSFL method. Notably, even under a poisoning attack in which 50% of all collaborators carried out label-flipping attacks, FedKD-IDS achieved an accuracy of 79%, whereas SSFL achieved only 19.86%. Furthermore, the results show that FedKD-IDS can exclude over 85% of malicious collaborators during the aggregation phase of federated training.
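The abstract does not spell out how the voting mechanism or the anti-poisoning step works, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that each collaborator labels the same shared unlabeled samples, that the server aggregates those labels by majority vote instead of averaging model parameters, and that collaborators who disagree too often with the first-pass vote are excluded. The function names, the `min_agreement` threshold, and the toy labels are all hypothetical.

```python
from collections import Counter


def majority_vote(client_labels):
    """Return the most common label per sample across all clients."""
    n_samples = len(client_labels[0])
    voted = []
    for i in range(n_samples):
        counts = Counter(labels[i] for labels in client_labels)
        voted.append(counts.most_common(1)[0][0])
    return voted


def agreement(labels, reference):
    """Fraction of samples on which a client matches the reference labels."""
    return sum(a == b for a, b in zip(labels, reference)) / len(reference)


def filter_and_vote(client_labels, min_agreement=0.5):
    """Hypothetical anti-poisoning step: drop clients that disagree with the
    first-pass vote too often, then vote again among the remaining clients."""
    first_pass = majority_vote(client_labels)
    kept = [
        idx
        for idx, labels in enumerate(client_labels)
        if agreement(labels, first_pass) >= min_agreement
    ]
    final = majority_vote([client_labels[idx] for idx in kept])
    return final, kept


# Toy data (assumed): 0 = benign, 1 = attack, six shared unlabeled samples.
honest = [0, 1, 1, 0, 1, 0]
flipped = [1 - y for y in honest]  # label-flipping collaborator
clients = [
    honest,
    honest,
    [0, 1, 1, 0, 1, 1],  # honest client with one mistake
    flipped,
    flipped,
]

final_labels, kept_clients = filter_and_vote(clients)
print(final_labels)
print(kept_clients)
```

In this toy run the two label-flipping clients fall below the agreement threshold and are dropped before the second vote. A plain majority vote cannot separate honest from malicious clients once half of them are malicious, so the 50% scenario reported above necessarily relies on details of FedKD-IDS that this sketch does not capture.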
About the journal:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.