Correlation-Based Knowledge Distillation in Exemplar-Free Class-Incremental Learning

Zijian Gao;Bo Liu;Kele Xu;Xinjun Mao;Huaimin Wang
{"title":"无范例类增量学习中基于关联的知识提炼","authors":"Zijian Gao;Bo Liu;Kele Xu;Xinjun Mao;Huaimin Wang","doi":"10.1109/OJCS.2025.3546754","DOIUrl":null,"url":null,"abstract":"Class-incremental learning (CIL) aims to learn a family of classes incrementally with data available in order rather than training all data at once. One main drawback of CIL is that standard deep neural networks suffer from catastrophic forgetting (CF), especially when the model only has access to data from the current incremental step. Knowledge Distillation (KD) is a widely used technique that utilizes old models as the teacher model to alleviate CF. However, based on a case study, our investigation reveals that the vanilla KD is insufficient with a strict point-to-point restriction. Instead, a relaxed match between the teacher and student improves distillation performance and model stability. In this article, we propose a simple yet effective method to mitigate CF without any additional training costs or requiring any exemplars. Specifically, we apply the linear correlation between the features of the teacher and student to measure the distillation loss rather than vanilla point-to-point loss, which significantly improves the model stability. Then, we utilize label augmentation to improve feature generalization and save prototypes to alleviate classification bias further. The proposed method significantly outperforms state-of-the-art methods in the various settings of benchmarks, including CIFAR-100 and Tiny-ImageNet, demonstrating its effectiveness and robustness.","PeriodicalId":13205,"journal":{"name":"IEEE Open Journal of the Computer Society","volume":"6 ","pages":"449-459"},"PeriodicalIF":0.0000,"publicationDate":"2025-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10908063","citationCount":"0","resultStr":"{\"title\":\"Correlation-Based Knowledge Distillation in Exemplar-Free Class-Incremental Learning\",\"authors\":\"Zijian Gao;Bo Liu;Kele Xu;Xinjun Mao;Huaimin Wang\",\"doi\":\"10.1109/OJCS.2025.3546754\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Class-incremental learning (CIL) aims to learn a family of classes incrementally with data available in order rather than training all data at once. One main drawback of CIL is that standard deep neural networks suffer from catastrophic forgetting (CF), especially when the model only has access to data from the current incremental step. Knowledge Distillation (KD) is a widely used technique that utilizes old models as the teacher model to alleviate CF. However, based on a case study, our investigation reveals that the vanilla KD is insufficient with a strict point-to-point restriction. Instead, a relaxed match between the teacher and student improves distillation performance and model stability. In this article, we propose a simple yet effective method to mitigate CF without any additional training costs or requiring any exemplars. Specifically, we apply the linear correlation between the features of the teacher and student to measure the distillation loss rather than vanilla point-to-point loss, which significantly improves the model stability. Then, we utilize label augmentation to improve feature generalization and save prototypes to alleviate classification bias further. 
The proposed method significantly outperforms state-of-the-art methods in the various settings of benchmarks, including CIFAR-100 and Tiny-ImageNet, demonstrating its effectiveness and robustness.\",\"PeriodicalId\":13205,\"journal\":{\"name\":\"IEEE Open Journal of the Computer Society\",\"volume\":\"6 \",\"pages\":\"449-459\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2025-02-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10908063\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Open Journal of the Computer Society\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10908063/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Open Journal of the Computer Society","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10908063/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Class-incremental learning (CIL) aims to learn a family of classes incrementally, with data arriving in sequence, rather than training on all data at once. One main drawback of CIL is that standard deep neural networks suffer from catastrophic forgetting (CF), especially when the model only has access to data from the current incremental step. Knowledge Distillation (KD) is a widely used technique that employs the old model as a teacher to alleviate CF. However, based on a case study, our investigation reveals that vanilla KD, with its strict point-to-point restriction, is insufficient; a relaxed match between teacher and student instead improves distillation performance and model stability. In this article, we propose a simple yet effective method to mitigate CF without any additional training cost and without requiring any exemplars. Specifically, we measure the distillation loss by the linear correlation between teacher and student features rather than by the vanilla point-to-point loss, which significantly improves model stability. We then utilize label augmentation to improve feature generalization and save prototypes to further alleviate classification bias. The proposed method significantly outperforms state-of-the-art methods across various benchmark settings, including CIFAR-100 and Tiny-ImageNet, demonstrating its effectiveness and robustness.
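To make the contrast in the abstract concrete, below is a minimal PyTorch sketch of a point-to-point distillation loss versus a correlation-based one. The per-sample Pearson-correlation formulation, the function names, and the feature shapes are illustrative assumptions for exposition, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def pointwise_kd_loss(student_feat: torch.Tensor, teacher_feat: torch.Tensor) -> torch.Tensor:
    # Vanilla point-to-point distillation: force each student feature value
    # to match the teacher's exactly (the strict restriction criticized above).
    return F.mse_loss(student_feat, teacher_feat)

def correlation_kd_loss(student_feat: torch.Tensor, teacher_feat: torch.Tensor,
                        eps: float = 1e-8) -> torch.Tensor:
    # Hypothetical correlation-based distillation: maximize the linear (Pearson)
    # correlation between student and teacher features per sample, which only
    # constrains the features up to an affine transform and so gives the student
    # a "relaxed match" with the teacher.
    s = student_feat - student_feat.mean(dim=1, keepdim=True)
    t = teacher_feat - teacher_feat.mean(dim=1, keepdim=True)
    corr = (s * t).sum(dim=1) / (s.norm(dim=1) * t.norm(dim=1) + eps)
    return (1.0 - corr).mean()

# Usage with random tensors standing in for backbone feature outputs
# (batch of 32 samples, 512-dimensional features).
student = torch.randn(32, 512, requires_grad=True)
teacher = torch.randn(32, 512)
loss = correlation_kd_loss(student, teacher.detach())
loss.backward()
```

In such a setup the teacher is the frozen model from the previous incremental step, so its features are detached from the gradient graph; only the student receives gradients from the distillation term.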