Intra-class progressive and adaptive self-distillation

Impact Factor: 6.0 · CAS Zone 1 (Computer Science) · JCR Q1 (Computer Science, Artificial Intelligence)
Jianping Gou, Jiaye Lin, Lin Li, Weihua Ou, Baosheng Yu, Zhang Yi
{"title":"Intra-class progressive and adaptive self-distillation","authors":"Jianping Gou ,&nbsp;Jiaye Lin ,&nbsp;Lin Li ,&nbsp;Weihua Ou ,&nbsp;Baosheng Yu ,&nbsp;Zhang Yi","doi":"10.1016/j.neunet.2025.107404","DOIUrl":null,"url":null,"abstract":"<div><div>In recent years, knowledge distillation (KD) has become widely used in compressing models, training compact and efficient students to reduce computational load and training time due to the increasing parameters in deep neural networks. To minimize training costs, self-distillation has been proposed, with methods like offline-KD and online-KD requiring pre-trained teachers and multiple networks. However, these self-distillation methods often overlook feature knowledge and category information. In this paper, we introduce Intra-class Progressive and Adaptive Self-Distillation (IPASD), which transfers knowledge from the front to the back in adjacent epochs. This method extracts class-typical features and promotes compactness within classes. By integrating feature-level and logits-level knowledge into strong teacher knowledge and using ground-truth labels as supervision signals, we adaptively optimize the model. We evaluated IPASD on CIFAR-10, CIFAR-100, Tiny ImageNet, Plant Village datasets, and ImageNet showing its superiority over state-of-the-art self-distillation methods in knowledge transfer and model compression. Our codes are available at: <span><span>https://github.com/JLinye/IPASD</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"188 ","pages":"Article 107404"},"PeriodicalIF":6.0000,"publicationDate":"2025-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Networks","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0893608025002837","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

In recent years, as the number of parameters in deep neural networks has kept growing, knowledge distillation (KD) has become widely used for model compression, training compact and efficient students to reduce computational load and training time. Because offline KD requires pre-trained teachers and online KD requires multiple networks, self-distillation has been proposed to minimize training costs. However, existing self-distillation methods often overlook feature knowledge and category information. In this paper, we introduce Intra-class Progressive and Adaptive Self-Distillation (IPASD), which transfers knowledge from the earlier epoch to the later one across adjacent epochs. The method extracts class-typical features and promotes compactness within classes. By integrating feature-level and logits-level knowledge into strong teacher knowledge and using ground-truth labels as supervision signals, it adaptively optimizes the model. We evaluated IPASD on CIFAR-10, CIFAR-100, Tiny ImageNet, Plant Village, and ImageNet, showing its superiority over state-of-the-art self-distillation methods in knowledge transfer and model compression. Our code is available at: https://github.com/JLinye/IPASD.
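The abstract describes the training procedure only at a high level. Below is a minimal, illustrative PyTorch sketch of the core "progressive" idea, where a snapshot of the model from the previous epoch serves as the teacher for the current one. The function names (distillation_loss, train_progressive), the temperature T, and the weight alpha are assumptions for illustration, not the authors' implementation; IPASD additionally distills feature-level knowledge and class-typical intra-class information, which this sketch omits. For the actual method, see the repository linked above.

# A minimal sketch of epoch-to-epoch self-distillation: the previous
# epoch's model snapshot teaches the current epoch via its logits,
# combined with cross-entropy on ground-truth labels.
import copy
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Cross-entropy against ground-truth labels (the supervision signal)
    # plus a temperature-scaled KL term transferring the teacher's logits.
    ce = F.cross_entropy(student_logits, labels)
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    return (1 - alpha) * ce + alpha * kd

def train_progressive(model, loader, optimizer, epochs, device="cpu"):
    teacher = None  # the first epoch has no teacher yet
    for _ in range(epochs):
        model.train()
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            logits = model(x)
            if teacher is None:
                loss = F.cross_entropy(logits, y)
            else:
                with torch.no_grad():
                    teacher_logits = teacher(x)
                loss = distillation_loss(logits, teacher_logits, y)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        # Snapshot this epoch's model as the teacher for the next epoch.
        teacher = copy.deepcopy(model).eval()
    return model

In this reading, "progressive" means the teacher is always one epoch ahead of nothing but the student's own past, so no pre-trained teacher or second network is needed; the paper's adaptive weighting of feature-level and logits-level knowledge would replace the fixed alpha used here.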
Source Journal
Neural Networks (Engineering & Technology, Computer Science: Artificial Intelligence)
CiteScore: 13.90
Self-citation rate: 7.70%
Annual articles: 425
Review time: 67 days
Journal Introduction: Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.