Enhancing knowledge distillation via genetic recombination

IF 7.2 · CAS Zone 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Yangjie Cao, Chuanjin Zhou, Minglin Liu, Weiqi Luo, Xiangyang Luo
Journal: Applied Soft Computing, Volume 181, Article 113414
DOI: 10.1016/j.asoc.2025.113414
Published: 2025-06-20
URL: https://www.sciencedirect.com/science/article/pii/S1568494625007252
Citations: 0

Abstract

Diverging from conventional knowledge distillation methods that solely emphasize improving the utilization of the teacher’s knowledge, this paper explores the generation of stronger student models within available knowledge. We first conceptualize the knowledge distillation process as a genetic evolution model. The student model is regarded as an independent individual, with its parameters representing the genes of that individual. These genes are partitioned into several alleles according to the architecture of the student model. Following that, we propose a universal strategy to enhance existing knowledge distillation methods by introducing genetic recombination. Prior to distillation, we initialize two independent identically distributed student models with different random seeds to obtain the first generation of genes. With each epoch of distillation, these genes evolve into the next generation. At specific generations, we randomly select one exchangeable allele from each of the two students for exchange. Our focus lies in determining the alleles to exchange and their corresponding exchange frequency (i.e., crossing-over value). This approach provides more choices and possibilities for subsequent evolution. Extensive experiments confirm the effectiveness of the strategy, demonstrating improvements across 12 distillation methods and 17 teacher–student combinations.
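The core mechanism described above — two independently initialized students whose parameter blocks ("alleles") are periodically exchanged during distillation — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the representation of a student as a dict mapping block names to parameters, and the names `swap_allele`, `train_epoch`, and `crossover_period` (the exchange frequency, i.e., the crossing-over value), are all assumptions made here for clarity.

```python
import random


def swap_allele(student_a, student_b, exchangeable, rng=random):
    """Randomly pick one exchangeable parameter block (allele) and
    swap it between the two student models, in place."""
    allele = rng.choice(exchangeable)
    student_a[allele], student_b[allele] = student_b[allele], student_a[allele]
    return allele


def distill_with_recombination(student_a, student_b, exchangeable,
                               epochs, crossover_period, train_epoch,
                               seed=0):
    """Run distillation on both students; every `crossover_period`
    epochs, perform one recombination (allele exchange) between them."""
    rng = random.Random(seed)
    for epoch in range(1, epochs + 1):
        # One epoch of (teacher-guided) distillation per student;
        # `train_epoch` stands in for whatever base KD method is used.
        train_epoch(student_a)
        train_epoch(student_b)
        if epoch % crossover_period == 0:
            swap_allele(student_a, student_b, exchangeable, rng)
    return student_a, student_b
```

In this sketch the choice of which alleles are "exchangeable" and how often to cross over are left as hyperparameters, mirroring the paper's stated focus on selecting the alleles to exchange and their exchange frequency.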
Source journal

Applied Soft Computing (Engineering & Technology – Computer Science: Interdisciplinary Applications)

CiteScore: 15.80
Self-citation rate: 6.90%
Articles per year: 874
Review time: 10.9 months

About the journal: Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. The focus is to publish the highest quality research in application and convergence of the areas of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets and other similar techniques to address real-world complexities. Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. Therefore, the web site will continuously be updated with new articles and the publication time will be short.