Intra-class progressive and adaptive self-distillation

Jianping Gou, Jiaye Lin, Lin Li, Weihua Ou, Baosheng Yu, Zhang Yi

Neural Networks, Volume 188, Article 107404 (published 2025-03-21)
DOI: 10.1016/j.neunet.2025.107404
https://www.sciencedirect.com/science/article/pii/S0893608025002837
Abstract
In recent years, knowledge distillation (KD) has become widely used for model compression, training compact and efficient students to reduce the computational load and training time caused by the growing number of parameters in deep neural networks. Because methods such as offline KD and online KD require pre-trained teachers or multiple networks, self-distillation has been proposed to minimize training costs. However, existing self-distillation methods often overlook feature knowledge and category information. In this paper, we introduce Intra-class Progressive and Adaptive Self-Distillation (IPASD), which transfers knowledge from an earlier epoch to the following one across adjacent epochs. The method extracts class-typical features and promotes intra-class compactness. By integrating feature-level and logit-level knowledge into strong teacher knowledge and using ground-truth labels as supervision signals, we adaptively optimize the model. We evaluated IPASD on CIFAR-10, CIFAR-100, Tiny ImageNet, Plant Village, and ImageNet, showing its superiority over state-of-the-art self-distillation methods in knowledge transfer and model compression. Our code is available at: https://github.com/JLinye/IPASD.
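The abstract describes the method only at a high level. The sketch below is a minimal illustration of the general idea of epoch-to-epoch self-distillation with combined logit-level and feature-level losses plus ground-truth supervision; the specific loss choices (KL divergence on softened logits, MSE on features), the hyper-parameters, and the assumption that the model returns both logits and features are illustrative assumptions and are not taken from the IPASD paper itself.

```python
import copy
import torch
import torch.nn.functional as F

def self_distillation_step(model, prev_model, images, labels,
                           temperature=4.0, alpha=0.5, beta=0.1):
    """One training step of epoch-to-epoch self-distillation.

    `prev_model` is a frozen snapshot of the same network from the
    previous epoch and acts as the teacher. The loss terms and weights
    here are assumptions for illustration, not the exact IPASD
    formulation.
    """
    logits, feats = model(images)            # assumed: model returns (logits, features)
    with torch.no_grad():
        t_logits, t_feats = prev_model(images)

    # Supervised loss against ground-truth labels.
    ce = F.cross_entropy(logits, labels)

    # Logit-level distillation from the previous-epoch snapshot.
    kd = F.kl_div(
        F.log_softmax(logits / temperature, dim=1),
        F.softmax(t_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2

    # Feature-level distillation (simple MSE stand-in).
    feat_loss = F.mse_loss(feats, t_feats)

    return ce + alpha * kd + beta * feat_loss


# After each epoch, refresh the frozen teacher snapshot:
# prev_model = copy.deepcopy(model).eval()
# for p in prev_model.parameters():
#     p.requires_grad_(False)
```

In this sketch the teacher costs no extra network capacity: it is simply the student's own weights from the previous epoch, which is what allows self-distillation to avoid the pre-trained teachers or parallel peer networks used by offline and online KD.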
Journal Introduction
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.