Supervised contrastive learning with prototype distillation for data incremental learning

IF 6 · Zone 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neural Networks Pub Date : 2025-06-03 DOI:10.1016/j.neunet.2025.107651

Suorong Yang, Tianyue Zhang, Zhiming Xu, Peijia Li, Baile Xu, Furao Shen, Jian Zhao

Neural Networks, volume 190, Article 107651. https://www.sciencedirect.com/science/article/pii/S0893608025005313

Citations: 0

Abstract

The goal of Data Incremental Learning (DIL) is to enable learning from small batches of data drawn from non-stationary data streams without clear task divisions. A key challenge in this domain is catastrophic forgetting in deep neural networks. To address the challenges inherent to DIL, trained models must exhibit both stability and flexibility, retaining information from previously learned classes while adapting to incorporate new ones. Prototypes are particularly effective for classifying separable embeddings in the feature space, as they pull embeddings from the same class together and push those from different classes further apart. This aligns with the principles of contrastive learning. In this paper, we propose the Supervised Contrastive learning with Prototype Distillation (SCPD) method for the DIL problem. First, we train the model with a supervised contrastive loss (SCL) to enhance the class separability of the extracted representations and improve the model’s flexibility. To further mitigate forgetting, we propose a prototype distillation loss (PDL) that keeps feature representations close to their corresponding prototypes, enhancing the model’s stability. Integrating SCL and PDL within SCPD ensures both the stability and the flexibility of the model. Experimental results demonstrate that SCPD outperforms prior state-of-the-art approaches across several benchmarks, including those with various imbalanced setups.
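
The two loss terms named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulations, so everything below is an assumption: the contrastive term follows the commonly used supervised contrastive form over L2-normalized embeddings, the prototype term is taken to be a mean squared distance between each embedding and a stored class-mean prototype, and the names supcon_loss, class_prototypes, prototype_distillation_loss, temperature, and lam are hypothetical. Input vectors are assumed to be non-zero.

```python
import math

def normalize(v):
    # L2-normalize a non-zero vector.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def supcon_loss(features, labels, temperature=0.1):
    # Assumed supervised contrastive form: for each anchor, average the
    # negative log-softmax over its same-class positives; anchors with no
    # positive in the batch are skipped.
    z = [normalize(f) for f in features]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        logits = {j: dot(z[i], z[j]) / temperature for j in range(n) if j != i}
        m = max(logits.values())  # subtract the max for numerical stability
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

def class_prototypes(features, labels):
    # Hypothetical prototype definition: per-class mean of normalized embeddings.
    sums, counts = {}, {}
    for f, y in zip(features, labels):
        z = normalize(f)
        if y not in sums:
            sums[y], counts[y] = [0.0] * len(z), 0
        sums[y] = [s + x for s, x in zip(sums[y], z)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def prototype_distillation_loss(features, labels, prototypes):
    # Assumed form (not taken from the paper): mean squared Euclidean distance
    # between each normalized embedding and the stored prototype of its class.
    total, count = 0.0, 0
    for f, y in zip(features, labels):
        if y not in prototypes:
            continue  # no stored prototype for this class yet
        z = normalize(f)
        total += sum((a - b) ** 2 for a, b in zip(z, prototypes[y]))
        count += 1
    return total / count if count else 0.0

# Toy usage: two classes in 2-D, prototypes computed from an earlier batch.
old_features = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
old_labels = [0, 0, 1, 1]
prototypes = class_prototypes(old_features, old_labels)

new_features = [[0.8, 0.3], [1.0, 0.0], [0.3, 0.8], [0.0, 1.0]]
new_labels = [0, 0, 1, 1]
lam = 1.0  # hypothetical weight balancing the two terms
combined = (supcon_loss(new_features, new_labels)
            + lam * prototype_distillation_loss(new_features, new_labels, prototypes))
```

In this sketch the contrastive term rewards separable classes within the current batch (flexibility), while the prototype term penalizes drift away from previously stored class centers (stability); how the paper actually weights or schedules the two terms is not stated in the abstract.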

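Source journal

Neural Networks (Engineering Technology - Computer Science: Artificial Intelligence)

CiteScore: 13.90
Self-citation rate: 7.70%
Articles published: 425
Review time: 67 days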

About the journal: Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically inspired artificial intelligence.