Less confidence, less forgetting: Learning with a humbler teacher in exemplar-free Class-Incremental learning

IF 6.3 | CAS Region 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Zijian Gao, Kele Xu, Huiping Zhuang, Li Liu, Xinjun Mao, Bo Ding, Dawei Feng, Huaimin Wang
{"title":"Less confidence, less forgetting: Learning with a humbler teacher in exemplar-free Class-Incremental learning","authors":"Zijian Gao ,&nbsp;Kele Xu ,&nbsp;Huiping Zhuang ,&nbsp;Li Liu ,&nbsp;Xinjun Mao ,&nbsp;Bo Ding ,&nbsp;Dawei Feng ,&nbsp;Huaimin Wang","doi":"10.1016/j.neunet.2024.106513","DOIUrl":null,"url":null,"abstract":"<div><p>Class-Incremental learning (CIL) is challenging due to catastrophic forgetting (CF), which escalates in exemplar-free scenarios. To mitigate CF, Knowledge Distillation (KD), which leverages old models as teacher models, has been widely employed in CIL. However, based on a case study, our investigation reveals that the teacher model exhibits over-confidence in unseen new samples. In this article, we conduct empirical experiments and provide theoretical analysis to investigate the over-confident phenomenon and the impact of KD in exemplar-free CIL, where access to old samples is unavailable. Building on our analysis, we propose a novel approach, Learning with Humbler Teacher, by systematically selecting an appropriate checkpoint model as a humbler teacher to mitigate CF. Furthermore, we explore utilizing the nuclear norm to obtain an appropriate temporal ensemble to enhance model stability. Notably, LwHT outperforms the state-of-the-art approach by a significant margin of 10.41%, 6.56%, and 4.31% in various settings while demonstrating superior model plasticity.</p></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"179 ","pages":"Article 106513"},"PeriodicalIF":6.3000,"publicationDate":"2024-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Networks","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0893608024004374","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Class-Incremental learning (CIL) is challenging due to catastrophic forgetting (CF), which escalates in exemplar-free scenarios. To mitigate CF, Knowledge Distillation (KD), which leverages old models as teacher models, has been widely employed in CIL. However, based on a case study, our investigation reveals that the teacher model exhibits over-confidence in unseen new samples. In this article, we conduct empirical experiments and provide theoretical analysis to investigate the over-confident phenomenon and the impact of KD in exemplar-free CIL, where access to old samples is unavailable. Building on our analysis, we propose a novel approach, Learning with Humbler Teacher (LwHT), which systematically selects an appropriate checkpoint model as a humbler teacher to mitigate CF. Furthermore, we explore utilizing the nuclear norm to obtain an appropriate temporal ensemble to enhance model stability. Notably, LwHT outperforms the state-of-the-art approach by significant margins of 10.41%, 6.56%, and 4.31% in various settings while demonstrating superior model plasticity.
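The abstract combines two ingredients: knowledge distillation against an earlier, less over-confident checkpoint of the old model, and a nuclear-norm criterion for building a temporal ensemble. The sketch below is a minimal, hypothetical illustration only, assuming a standard Hinton-style KD loss and a nuclear-norm score over batch predictions; the toy models, loss masking, and scoring are assumptions and do not reproduce the paper's actual LwHT algorithm or checkpoint-selection rule.

```python
# Hypothetical sketch: KD against an earlier ("humbler") checkpoint teacher,
# plus a nuclear-norm score one might use to compare candidate checkpoints.
# NOT the paper's exact LwHT method; all names and weightings are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F


def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Standard KD loss (KL divergence between softened distributions)."""
    t = temperature
    p_teacher = F.softmax(teacher_logits / t, dim=1)
    log_p_student = F.log_softmax(student_logits / t, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (t * t)


def nuclear_norm_score(logits):
    """Nuclear norm (sum of singular values) of softened batch predictions.

    A larger value loosely indicates more diverse, less collapsed outputs;
    the exact criterion the paper uses is not specified here and is assumed.
    """
    return torch.linalg.matrix_norm(F.softmax(logits, dim=1), ord="nuc")


if __name__ == "__main__":
    # Toy tensors standing in for a batch of new-task samples.
    num_old_classes, num_new_classes, batch = 10, 5, 32
    x = torch.randn(batch, 64)

    # Hypothetical models: a frozen earlier checkpoint acts as the teacher.
    teacher = nn.Linear(64, num_old_classes)
    student = nn.Linear(64, num_old_classes + num_new_classes)
    for p in teacher.parameters():
        p.requires_grad_(False)

    student_logits = student(x)
    with torch.no_grad():
        teacher_logits = teacher(x)

    # Distill only on the old-class slice of the student's output.
    loss_kd = kd_loss(student_logits[:, :num_old_classes], teacher_logits)
    print("KD loss:", loss_kd.item())
    print("Nuclear-norm score of teacher outputs:", nuclear_norm_score(teacher_logits).item())
```

In this reading, a "humbler" teacher is simply an intermediate checkpoint whose softened predictions on new samples are less peaked than those of the fully converged old model, so the KD term constrains the student less aggressively; how LwHT actually picks that checkpoint and forms the temporal ensemble is detailed in the paper itself.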

Source Journal
Neural Networks (Engineering & Technology – Computer Science: Artificial Intelligence)
CiteScore: 13.90
Self-citation rate: 7.70%
Articles per year: 425
Review time: 67 days
Journal description: Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.