{"title":"Data-free knowledge distillation via generator-free data generation for Non-IID federated learning","authors":"Siran Zhao , Tianchi Liao , Lele Fu , Chuan Chen , Jing Bian , Zibin Zheng","doi":"10.1016/j.neunet.2024.106627","DOIUrl":null,"url":null,"abstract":"<div><p>Data heterogeneity (Non-IID) on Federated Learning (FL) is currently a widely publicized problem, which leads to local model drift and performance degradation. Because of the advantage of knowledge distillation, it has been explored in some recent work to refine global models. However, these approaches rely on a proxy dataset or a data generator. First, in many FL scenarios, proxy dataset do not necessarily exist on the server. Second, the quality of data generated by the generator is unstable and the generator depends on the computing resources of the server. In this work, we propose a novel data-Free knowledge distillation approach via generator-Free Data Generation for Non-IID FL, dubbed as FedF<sup>2</sup>DG. Specifically, FedF<sup>2</sup>DG requires only local models to generate pseudo datasets for each client, and can generate hard samples by adding an additional regularization term that exploit disagreements between local model and global model. Meanwhile, FedF<sup>2</sup>DG enables flexible utilization of computational resources by generating pseudo dataset locally or on the server. And to address the label distribution shift in Non-IID FL, we propose a Data Generation Principle that can adaptively control the label distribution and number of pseudo dataset based on client current state, and this allows for the extraction of more client knowledge. Then knowledge distillation is performed to transfer the knowledge in local models to the global model. Extensive experiments demonstrate that our proposed method significantly outperforms the state-of-the-art FL methods and can serve as plugin for existing Federated Learning methds such as FedAvg, FedProx, etc, and improve their performance.</p></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"179 ","pages":"Article 106627"},"PeriodicalIF":6.0000,"publicationDate":"2024-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Networks","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0893608024005513","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
Data heterogeneity (Non-IID) in Federated Learning (FL) is a widely recognized problem that leads to local model drift and performance degradation. Owing to its advantages, knowledge distillation has been explored in recent work to refine the global model. However, these approaches rely on a proxy dataset or a data generator. First, in many FL scenarios, a proxy dataset does not necessarily exist on the server. Second, the quality of data produced by a generator is unstable, and the generator depends on the server's computing resources. In this work, we propose a novel data-free knowledge distillation approach via generator-free data generation for Non-IID FL, dubbed FedF²DG. Specifically, FedF²DG requires only the local models to generate a pseudo dataset for each client, and it can generate hard samples by adding a regularization term that exploits disagreements between the local model and the global model. Meanwhile, FedF²DG enables flexible use of computational resources by generating the pseudo datasets either locally or on the server. To address the label distribution shift in Non-IID FL, we propose a Data Generation Principle that adaptively controls the label distribution and size of each pseudo dataset based on the client's current state, which allows more client knowledge to be extracted. Knowledge distillation is then performed to transfer the knowledge in the local models to the global model. Extensive experiments demonstrate that our proposed method significantly outperforms state-of-the-art FL methods and can serve as a plug-in for existing Federated Learning methods such as FedAvg and FedProx, improving their performance.
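The abstract describes two steps: synthesizing pseudo data directly from the local models (no generator network), with an extra term that rewards disagreement between the local and global models to obtain hard samples, and then distilling the local models' knowledge into the global model on that pseudo data. Below is a minimal, illustrative sketch of these two steps, not the authors' implementation; all function names, loss weights, shapes, and hyperparameters are assumptions for illustration.

```python
# Hedged sketch of generator-free pseudo-data synthesis and knowledge distillation.
# Assumes PyTorch classifiers `local_model` and `global_model` with matching output sizes.
import torch
import torch.nn.functional as F


def generate_pseudo_data(local_model, global_model, target_labels,
                         image_shape=(3, 32, 32), steps=200, lr=0.1,
                         disagreement_weight=1.0):
    """Optimize input tensors so the local model classifies them as `target_labels`,
    while encouraging disagreement with the global model (hypothetical regularizer)."""
    local_model.eval()
    global_model.eval()
    for p in list(local_model.parameters()) + list(global_model.parameters()):
        p.requires_grad_(False)  # only the pseudo inputs are optimized

    x = torch.randn(len(target_labels), *image_shape, requires_grad=True)
    optimizer = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        local_logits = local_model(x)
        global_logits = global_model(x)
        # Make the local model confident on the requested labels.
        ce_loss = F.cross_entropy(local_logits, target_labels)
        # Penalize agreement between local and global predictions so the
        # synthesized samples are "hard" for the current global model.
        agreement = F.kl_div(F.log_softmax(global_logits, dim=1),
                             F.softmax(local_logits, dim=1),
                             reduction="batchmean")
        loss = ce_loss - disagreement_weight * agreement
        loss.backward()
        optimizer.step()
    return x.detach(), target_labels


def distill_to_global(global_model, local_model, pseudo_x,
                      epochs=5, lr=1e-3, temperature=2.0):
    """Transfer the local model's knowledge to the global model on the pseudo dataset."""
    local_model.eval()
    global_model.train()
    for p in global_model.parameters():
        p.requires_grad_(True)
    optimizer = torch.optim.SGD(global_model.parameters(), lr=lr)
    for _ in range(epochs):
        optimizer.zero_grad()
        with torch.no_grad():
            teacher_logits = local_model(pseudo_x)
        student_logits = global_model(pseudo_x)
        kd_loss = F.kl_div(F.log_softmax(student_logits / temperature, dim=1),
                           F.softmax(teacher_logits / temperature, dim=1),
                           reduction="batchmean") * temperature ** 2
        kd_loss.backward()
        optimizer.step()
    return global_model
```

In this sketch the label distribution of the pseudo data is simply whatever `target_labels` contains; the paper's Data Generation Principle (adaptively choosing that distribution and the dataset size from each client's state) is not reproduced here.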
Journal introduction:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.