{"title":"Data-free knowledge distillation via text-noise fusion and dynamic adversarial temperature","authors":"Deheng Zeng , Zhengyang Wu , Yunwen Chen , Zhenhua Huang","doi":"10.1016/j.neunet.2025.108061","DOIUrl":null,"url":null,"abstract":"<div><div>Data-Free Knowledge Distillation (DFKD) have achieved significant breakthroughs, enabling the effective transfer of knowledge from teacher neural networks to student neural networks without reliance on original data. However, a significant challenge faced by existing methods that attempt to generate samples from random noise is that the noise lacks meaningful information, such as class-specific semantic information. Consequently, the absence of meaningful information makes it difficult for the generator to map this noise to the ground-truth data distribution, resulting in the generation of low-quality training samples. In addition, existing methods typically employ a fixed temperature for adversarial training of the generator, which limits the diversity in the difficulty of the synthesized data. In this paper, we propose Text-Noise Fusion and Dynamic Adversarial Temperature method (TNFDAT), a novel method that combines random noise with meaningful class-specific text embeddings (CSTE) as input and implements dynamic adjustment of the adversarial training temperature for the generator. In addition, we introduce an adaptive sample weighting strategy to enhance the effectiveness of knowledge distillation. CSTE is developed based on a pre-trained language model, and its significance lies in its ability to capture meaningful inter-class information, thereby enabling the generation of high-quality samples. Simultaneously, the dynamic adversarial temperature module effectively alleviates the issue of insufficient diversity in synthesized samples by precisely modulating the generator’s temperature during adversarial training, playing a key role in enhancing sample diversity. Through continuous and dynamic temperature adjustment of the generator in the adversarial training, thereby significantly improving the overall diversity of the synthesized samples. At the knowledge distillation stage, We determine the distillation weights of the synthesized samples based on the information entropy of the output from both teacher and student networks. By differentiating the contributions of different synthesized samples during the distillation process, we effectively enhance the generalization ability of the knowledge distillation framework and improve the robustness of the student network. Experiments demonstrate that our method outperforms the state-of-the-art methods across various benchmarks and pairs of teachers and students.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"193 ","pages":"Article 108061"},"PeriodicalIF":6.3000,"publicationDate":"2025-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Networks","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0893608025009414","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Abstract
Data-Free Knowledge Distillation (DFKD) has achieved significant breakthroughs, enabling the effective transfer of knowledge from teacher neural networks to student neural networks without reliance on the original data. However, a significant challenge for existing methods that generate samples from random noise is that the noise lacks meaningful information, such as class-specific semantics. This absence of meaningful information makes it difficult for the generator to map the noise to the ground-truth data distribution, resulting in low-quality training samples. In addition, existing methods typically employ a fixed temperature for adversarial training of the generator, which limits the diversity in difficulty of the synthesized data. In this paper, we propose the Text-Noise Fusion and Dynamic Adversarial Temperature method (TNFDAT), a novel method that combines random noise with meaningful class-specific text embeddings (CSTE) as the generator input and dynamically adjusts the adversarial training temperature of the generator. We also introduce an adaptive sample weighting strategy to enhance the effectiveness of knowledge distillation. CSTE is built on a pre-trained language model; its significance lies in capturing meaningful inter-class information, thereby enabling the generation of high-quality samples. Meanwhile, the dynamic adversarial temperature module alleviates the insufficient diversity of synthesized samples: by continuously and dynamically modulating the generator's temperature during adversarial training, it significantly improves the overall diversity of the synthesized samples. At the knowledge distillation stage, we determine the distillation weights of the synthesized samples from the information entropy of the outputs of the teacher and student networks. By differentiating the contributions of different synthesized samples during distillation, we enhance the generalization ability of the knowledge distillation framework and improve the robustness of the student network. Experiments demonstrate that our method outperforms state-of-the-art methods across various benchmarks and teacher-student pairs.
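The abstract describes three mechanisms: fusing random noise with class-specific text embeddings as the generator input, varying the adversarial temperature during generator training, and weighting synthesized samples in the distillation loss by the entropy of teacher and student outputs. The sketch below is a minimal, hedged illustration of these ideas in PyTorch; the concatenation-based fusion, the cosine temperature schedule, and the exact entropy-based weighting rule are assumptions for illustration, not the paper's released implementation.

```python
# Minimal sketch of the ideas in the abstract (illustrative assumptions throughout).
import math
import torch
import torch.nn.functional as F

def fuse_text_and_noise(noise, text_emb):
    """Form the generator input from random noise and class-specific text
    embeddings (CSTE). Fusion by concatenation is an assumption."""
    return torch.cat([noise, text_emb], dim=1)

def dynamic_temperature(step, total_steps, t_min=1.0, t_max=4.0):
    """Illustrative schedule: the adversarial temperature varies smoothly over
    training instead of staying fixed (cosine schedule assumed)."""
    return t_min + 0.5 * (t_max - t_min) * (1 + math.cos(math.pi * step / total_steps))

def generator_adversarial_loss(teacher_logits, student_logits, tau):
    """Generator seeks teacher-student disagreement at temperature tau,
    i.e. it minimizes the negative KL divergence of the softened outputs."""
    p_t = F.softmax(teacher_logits / tau, dim=1)
    log_p_s = F.log_softmax(student_logits / tau, dim=1)
    return -F.kl_div(log_p_s, p_t, reduction="batchmean") * tau * tau

def entropy_weighted_kd_loss(teacher_logits, student_logits, tau=4.0):
    """Student distillation loss with per-sample weights derived from the
    information entropy of teacher and student outputs (weighting rule assumed)."""
    p_t = F.softmax(teacher_logits / tau, dim=1)
    p_s = F.softmax(student_logits / tau, dim=1)
    ent_t = -(p_t * p_t.clamp_min(1e-8).log()).sum(dim=1)
    ent_s = -(p_s * p_s.clamp_min(1e-8).log()).sum(dim=1)
    # Assumed heuristic: down-weight samples on which both networks are already
    # uncertain, and renormalize so the overall loss scale stays stable.
    w = torch.exp(-(ent_t + ent_s))
    w = w / w.sum() * w.numel()
    log_p_s = F.log_softmax(student_logits / tau, dim=1)
    per_sample_kl = F.kl_div(log_p_s, p_t, reduction="none").sum(dim=1)
    return (w * per_sample_kl).mean() * tau * tau

# Example usage with random tensors (batch of 8, 10 classes, 100-d noise, 64-d CSTE).
z = torch.randn(8, 100)
cste = torch.randn(8, 64)                 # stand-in for pretrained-LM class embeddings
gen_input = fuse_text_and_noise(z, cste)  # would feed the generator
tau = dynamic_temperature(step=500, total_steps=1000)
t_logits, s_logits = torch.randn(8, 10), torch.randn(8, 10)
loss_g = generator_adversarial_loss(t_logits, s_logits, tau)
loss_kd = entropy_weighted_kd_loss(t_logits, s_logits)
```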
Journal Introduction
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.