PFedKD: Personalized Federated Learning via Knowledge Distillation Using Unlabeled Pseudo Data for Internet of Things

IF: 8.9; Zone 1 (Computer Science); JCR Q1 (COMPUTER SCIENCE, INFORMATION SYSTEMS)

IEEE Internet of Things Journal; Pub Date: 2025-01-28; DOI: 10.1109/JIOT.2025.3533003

Hanxi Li; Guorong Chen; Bin Wang; Zheng Chen; Yongsheng Zhu; Fuqiang Hu; Jiao Dai; Wei Wang

Volume 12, Issue 11, pp. 16314-16324; not open access. Full text: https://ieeexplore.ieee.org/document/10855800/

Citations: 0

Abstract


PFedKD: Personalized Federated Learning via Knowledge Distillation Using Unlabeled Pseudo Data for Internet of Things

With the rapid advancement of wearable devices and Internet of Things (IoT) technologies, the volume of sensor data generated by edge devices has surged. This data is crucial for advancing IoT applications, including health status monitoring, abnormal behavior detection, and environmental monitoring. However, traditional centralized learning requires uploading data to a central server, which raises security and privacy concerns and hinders use of the data. Federated learning (FL) offers a solution by enabling collaborative model training on IoT devices without transferring data off the local device. In practice, data generated by edge devices is often highly heterogeneous, so a single global FL model struggles to capture local data distributions accurately, and performance degrades significantly. Additionally, imbalanced edge device resources and limited bandwidth can cause data transmission delays or interruptions, impacting application feasibility. To address these issues, we propose PFedKD, a novel personalized FL algorithm based on knowledge distillation, aimed at enhancing the model's generalization ability and reducing communication overhead in heterogeneous IoT data environments. PFedKD constructs a public dataset from unlabeled pseudo data to extract knowledge from each client, and trains personalized models that fit local data distributions. This approach keeps the dataset size under control while improving performance. During communication, only logits and class prototypes are transmitted, ensuring high communication efficiency. Sharpness-aware minimization is introduced in local model training to improve generalization. Additionally, we design a weight distribution mechanism based on client sample quality evaluation that optimizes knowledge aggregation and model personalization. Extensive experiments demonstrate that PFedKD significantly outperforms state-of-the-art baselines in both learning performance and communication efficiency.
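The abstract does not state the loss functions or the weighting rule, so the following is only a minimal sketch of the general idea it describes: clients share logits computed on a common unlabeled pseudo dataset, the logits are combined with per-client weights, and a model is distilled toward the combined output. The function names (`aggregate_logits`, `distillation_loss`), the temperature value, and the client weights are all hypothetical, and the loss is the standard temperature-scaled KL divergence rather than anything confirmed by the paper. Class prototypes and sharpness-aware minimization are not covered.

```python
import math
from typing import List, Sequence

Vector = List[float]


def softmax(logits: Sequence[float], temperature: float = 1.0) -> Vector:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_logits(client_logits: Sequence[Sequence[Sequence[float]]],
                     client_weights: Sequence[float]) -> List[Vector]:
    """Weighted average of per-client logits on the shared pseudo samples.

    client_logits[k][i] is client k's logit vector for pseudo sample i.
    client_weights are assumed quality scores (hypothetical); they are
    normalized to sum to 1 before averaging.
    """
    total = sum(client_weights)
    norm = [w / total for w in client_weights]
    n_samples = len(client_logits[0])
    n_classes = len(client_logits[0][0])
    return [
        [sum(norm[k] * client_logits[k][i][c] for k in range(len(norm)))
         for c in range(n_classes)]
        for i in range(n_samples)
    ]


def distillation_loss(student_logits: Sequence[Sequence[float]],
                      teacher_logits: Sequence[Sequence[float]],
                      temperature: float = 2.0) -> float:
    """Mean KL(teacher || student) over the pseudo samples, scaled by T^2."""
    loss = 0.0
    for s, t in zip(student_logits, teacher_logits):
        p = softmax(t, temperature)  # softened teacher distribution
        q = softmax(s, temperature)  # softened student distribution
        loss += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * loss / len(student_logits)


# Two clients, two pseudo samples, three classes (made-up numbers).
client_logits = [
    [[2.0, 0.0, -1.0], [0.5, 1.5, 0.0]],  # client 0
    [[1.0, 1.0, 0.0], [0.0, 2.0, 1.0]],   # client 1
]
teacher = aggregate_logits(client_logits, client_weights=[3.0, 1.0])
print(teacher[0])  # [1.75, 0.25, -0.75]
print(distillation_loss(client_logits[1], teacher))
```

Only the logit vectors cross the network in this sketch, one vector of length `n_classes` per pseudo sample, which is the source of the communication savings the abstract refers to; model parameters never leave the client.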


Source journal

IEEE Internet of Things Journal (Computer Science: Information Systems)

CiteScore: 17.60

Self-citation rate: 13.20%

Articles published: 1982

Journal description: The IEEE Internet of Things (IoT) Journal publishes articles and review articles covering various aspects of IoT, including IoT system architecture, IoT enabling technologies, IoT communication and networking protocols such as network coding, and IoT services and applications. Topics encompass IoT's impact on sensor technologies, big data management, and future Internet design for applications such as smart cities and smart homes. Fields of interest include: IoT architecture (things-centric, data-centric, and service-oriented); IoT enabling technologies and systematic integration (sensor technologies, big sensor data management, and future Internet design for IoT); IoT services, applications, and test-beds (IoT service middleware, IoT application programming interfaces (APIs), IoT application design, and IoT trials and experiments); and IoT standardization activities and technology development in standards development organizations (SDOs) such as IEEE, IETF, ITU, 3GPP, and ETSI.