{"title":"面向跨设备联邦学习的分层知识蒸馏","authors":"Huy Q. Le, Loc X. Nguyen, Seong-Bae Park, C. Hong","doi":"10.1109/ICOIN56518.2023.10049011","DOIUrl":null,"url":null,"abstract":"Federated Learning (FL) has been proposed as a decentralized machine learning system where multiple clients jointly train the model without sharing private data. In FL, the statistical heterogeneity among devices has become a crucial challenge, which can cause degradation in generalization performance. Previous FL approaches have proven that leveraging the proximal regularization at the local training process can alleviate the divergence of parameter aggregation from biased local models. In this work, to address the heterogeneity issues in conventional FL, we propose a layer-wise knowledge distillation method in federated learning, namely, FedLKD, which regularizes the local training step via the knowledge distillation scheme between global and local models utilizing the small proxy dataset. Hence, FedLKD deploys the layer-wise knowledge distillation of the multiple devices and the global server as the clients’ regularized loss function. A layer-wise knowledge distillation mechanism is introduced to update the local model to exploit the common representation from different layers. Through extensive experiments, we demonstrate that FedLKD outperforms the vanilla FedAvg and FedProx on three federated datasets.","PeriodicalId":285763,"journal":{"name":"2023 International Conference on Information Networking (ICOIN)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Layer-wise Knowledge Distillation for Cross-Device Federated Learning\",\"authors\":\"Huy Q. Le, Loc X. Nguyen, Seong-Bae Park, C. Hong\",\"doi\":\"10.1109/ICOIN56518.2023.10049011\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Federated Learning (FL) has been proposed as a decentralized machine learning system where multiple clients jointly train the model without sharing private data. In FL, the statistical heterogeneity among devices has become a crucial challenge, which can cause degradation in generalization performance. Previous FL approaches have proven that leveraging the proximal regularization at the local training process can alleviate the divergence of parameter aggregation from biased local models. In this work, to address the heterogeneity issues in conventional FL, we propose a layer-wise knowledge distillation method in federated learning, namely, FedLKD, which regularizes the local training step via the knowledge distillation scheme between global and local models utilizing the small proxy dataset. Hence, FedLKD deploys the layer-wise knowledge distillation of the multiple devices and the global server as the clients’ regularized loss function. A layer-wise knowledge distillation mechanism is introduced to update the local model to exploit the common representation from different layers. 
Through extensive experiments, we demonstrate that FedLKD outperforms the vanilla FedAvg and FedProx on three federated datasets.\",\"PeriodicalId\":285763,\"journal\":{\"name\":\"2023 International Conference on Information Networking (ICOIN)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-01-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 International Conference on Information Networking (ICOIN)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICOIN56518.2023.10049011\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 International Conference on Information Networking (ICOIN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICOIN56518.2023.10049011","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Federated Learning (FL) is a decentralized machine learning paradigm in which multiple clients jointly train a model without sharing their private data. Statistical heterogeneity across devices has become a crucial challenge in FL, as it can degrade generalization performance. Previous FL approaches have shown that applying proximal regularization during local training can alleviate the divergence that arises when parameters are aggregated from biased local models. In this work, to address the heterogeneity issues of conventional FL, we propose a layer-wise knowledge distillation method for federated learning, namely FedLKD, which regularizes local training via a knowledge distillation scheme between the global and local models using a small proxy dataset. FedLKD thus employs layer-wise knowledge distillation between the multiple devices and the global server as a regularization term in the clients' loss function. This layer-wise mechanism updates the local model so that it exploits the common representations captured at different layers of the global model. Through extensive experiments, we demonstrate that FedLKD outperforms vanilla FedAvg and FedProx on three federated datasets.
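To make the idea of the regularized local objective concrete, below is a minimal illustrative sketch (not the authors' code): a client minimizes its task loss plus a layer-wise distillation term that matches its intermediate representations to those of the frozen global model on a small proxy batch. The toy MLP, the MSE layer-matching objective, and the names layerwise_kd_loss, local_update, and lam are assumptions introduced here for illustration; the actual FedLKD loss and distillation details are defined in the paper.

```python
# Illustrative sketch of layer-wise KD as a local-training regularizer in FL.
# Assumptions: PyTorch, a toy MLP client/global model, MSE between
# intermediate activations, and a small unlabeled proxy batch.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MLP(nn.Module):
    """Toy model that exposes per-layer activations alongside its logits."""
    def __init__(self, in_dim=32, hidden=64, num_classes=10):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Linear(in_dim, hidden),
            nn.Linear(hidden, hidden),
            nn.Linear(hidden, num_classes),
        ])

    def forward(self, x):
        feats = []
        for layer in self.layers[:-1]:
            x = F.relu(layer(x))
            feats.append(x)           # keep intermediate representations
        logits = self.layers[-1](x)
        return logits, feats


def layerwise_kd_loss(local_model, global_model, proxy_x):
    """Match each local layer's representation to the frozen global model's."""
    with torch.no_grad():
        _, global_feats = global_model(proxy_x)
    _, local_feats = local_model(proxy_x)
    return sum(F.mse_loss(l, g) for l, g in zip(local_feats, global_feats))


def local_update(local_model, global_model, data_loader, proxy_x,
                 lam=0.1, lr=0.01, epochs=1):
    """One client's local training: task loss + layer-wise KD regularizer."""
    opt = torch.optim.SGD(local_model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in data_loader:
            opt.zero_grad()
            logits, _ = local_model(x)
            loss = F.cross_entropy(logits, y) + lam * layerwise_kd_loss(
                local_model, global_model, proxy_x)
            loss.backward()
            opt.step()
    # The updated weights would then be sent back for FedAvg-style aggregation.
    return local_model.state_dict()
```

In this sketch the weight lam trades off fitting the local data against staying close to the global model's layer-wise representations, playing a role analogous to the proximal term in FedProx but expressed in representation space rather than parameter space.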