{"title":"Data-free knowledge distillation via generator-free data generation for Non-IID federated learning","authors":"Siran Zhao , Tianchi Liao , Lele Fu , Chuan Chen , Jing Bian , Zibin Zheng","doi":"10.1016/j.neunet.2024.106627","DOIUrl":null,"url":null,"abstract":"<div><p>Data heterogeneity (Non-IID) on Federated Learning (FL) is currently a widely publicized problem, which leads to local model drift and performance degradation. Because of the advantage of knowledge distillation, it has been explored in some recent work to refine global models. However, these approaches rely on a proxy dataset or a data generator. First, in many FL scenarios, proxy dataset do not necessarily exist on the server. Second, the quality of data generated by the generator is unstable and the generator depends on the computing resources of the server. In this work, we propose a novel data-Free knowledge distillation approach via generator-Free Data Generation for Non-IID FL, dubbed as FedF<sup>2</sup>DG. Specifically, FedF<sup>2</sup>DG requires only local models to generate pseudo datasets for each client, and can generate hard samples by adding an additional regularization term that exploit disagreements between local model and global model. Meanwhile, FedF<sup>2</sup>DG enables flexible utilization of computational resources by generating pseudo dataset locally or on the server. And to address the label distribution shift in Non-IID FL, we propose a Data Generation Principle that can adaptively control the label distribution and number of pseudo dataset based on client current state, and this allows for the extraction of more client knowledge. Then knowledge distillation is performed to transfer the knowledge in local models to the global model. Extensive experiments demonstrate that our proposed method significantly outperforms the state-of-the-art FL methods and can serve as plugin for existing Federated Learning methds such as FedAvg, FedProx, etc, and improve their performance.</p></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"179 ","pages":"Article 106627"},"PeriodicalIF":6.0000,"publicationDate":"2024-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Networks","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0893608024005513","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
Data heterogeneity (Non-IID) in Federated Learning (FL) is a widely recognized problem that leads to local model drift and performance degradation. Owing to its advantages, knowledge distillation has been explored in recent work to refine the global model. However, these approaches rely on a proxy dataset or a data generator. First, in many FL scenarios, a proxy dataset does not necessarily exist on the server. Second, the quality of data produced by a generator is unstable, and the generator depends on the server's computing resources. In this work, we propose a novel data-free knowledge distillation approach via generator-free data generation for Non-IID FL, dubbed FedF²DG. Specifically, FedF²DG requires only the local models to generate a pseudo dataset for each client, and it can generate hard samples by adding a regularization term that exploits disagreements between the local model and the global model. Meanwhile, FedF²DG enables flexible use of computational resources by generating the pseudo datasets either locally or on the server. To address the label distribution shift in Non-IID FL, we propose a Data Generation Principle that adaptively controls the label distribution and size of each pseudo dataset based on the client's current state, which allows more client knowledge to be extracted. Knowledge distillation is then performed to transfer the knowledge in the local models to the global model. Extensive experiments demonstrate that our proposed method significantly outperforms state-of-the-art FL methods and can serve as a plug-in for existing Federated Learning methods such as FedAvg and FedProx, improving their performance.
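The abstract describes two steps: synthesizing pseudo data directly from the local models (no generator network), with an extra term that rewards disagreement between the local and global models to obtain hard samples, and then distilling the local models' knowledge into the global model on that pseudo data. Below is a minimal, illustrative sketch of these two steps, not the authors' implementation; all function names, loss weights, shapes, and hyperparameters are assumptions for illustration.

```python
# Hedged sketch of generator-free pseudo-data synthesis and knowledge distillation.
# Assumes PyTorch classifiers `local_model` and `global_model` with matching output sizes.
import torch
import torch.nn.functional as F


def generate_pseudo_data(local_model, global_model, target_labels,
                         image_shape=(3, 32, 32), steps=200, lr=0.1,
                         disagreement_weight=1.0):
    """Optimize input tensors so the local model classifies them as `target_labels`,
    while encouraging disagreement with the global model (hypothetical regularizer)."""
    local_model.eval()
    global_model.eval()
    for p in list(local_model.parameters()) + list(global_model.parameters()):
        p.requires_grad_(False)  # only the pseudo inputs are optimized

    x = torch.randn(len(target_labels), *image_shape, requires_grad=True)
    optimizer = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        local_logits = local_model(x)
        global_logits = global_model(x)
        # Make the local model confident on the requested labels.
        ce_loss = F.cross_entropy(local_logits, target_labels)
        # Penalize agreement between local and global predictions so the
        # synthesized samples are "hard" for the current global model.
        agreement = F.kl_div(F.log_softmax(global_logits, dim=1),
                             F.softmax(local_logits, dim=1),
                             reduction="batchmean")
        loss = ce_loss - disagreement_weight * agreement
        loss.backward()
        optimizer.step()
    return x.detach(), target_labels


def distill_to_global(global_model, local_model, pseudo_x,
                      epochs=5, lr=1e-3, temperature=2.0):
    """Transfer the local model's knowledge to the global model on the pseudo dataset."""
    local_model.eval()
    global_model.train()
    for p in global_model.parameters():
        p.requires_grad_(True)
    optimizer = torch.optim.SGD(global_model.parameters(), lr=lr)
    for _ in range(epochs):
        optimizer.zero_grad()
        with torch.no_grad():
            teacher_logits = local_model(pseudo_x)
        student_logits = global_model(pseudo_x)
        kd_loss = F.kl_div(F.log_softmax(student_logits / temperature, dim=1),
                           F.softmax(teacher_logits / temperature, dim=1),
                           reduction="batchmean") * temperature ** 2
        kd_loss.backward()
        optimizer.step()
    return global_model
```

In this sketch the label distribution of the pseudo data is simply whatever `target_labels` contains; the paper's Data Generation Principle (adaptively choosing that distribution and the dataset size from each client's state) is not reproduced here.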
Journal introduction:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.