Online Continual Learning Benefits From Large Number of Task Splits

Shilin Zhang, Chenlin Yi
{"title":"Online Continual Learning Benefits From Large Number of Task Splits","authors":"Shilin Zhang;Chenlin Yi","doi":"10.1109/TAI.2024.3405404","DOIUrl":null,"url":null,"abstract":"This work tackles the significant challenges inherent in online continual learning (OCL), a domain characterized by its handling of numerous tasks over extended periods. OCL is designed to adapt evolving data distributions and previously unseen classes through a single-pass analysis of a data stream, mirroring the dynamic nature of real-world applications. Despite its promising potential, existing OCL methodologies often suffer from catastrophic forgetting (CF) when confronted with a large array of tasks, compounded by substantial computational demands that limit their practical utility. At the heart of our proposed solution is the adoption of a kernel density estimation (KDE) learning framework, aimed at resolving the task prediction (TP) dilemma and ensuring the separability of all tasks. This is achieved through the incorporation of a linear projection head and a probability density function (PDF) for each task, while a shared backbone is maintained across tasks to provide raw feature representation. During the inference phase, we leverage an ensemble of PDFs, which utilizes a self-reporting mechanism based on maximum PDF values to identify the most appropriate model for classifying incoming instances. This strategy ensures that samples with identical labels are cohesively grouped within higher density PDF regions, effectively segregating dissimilar instances across the feature space of different tasks. Extensive experimental validation across diverse OCL datasets has underscored our framework's efficacy, showcasing remarkable performance enhancements and significant gains over existing methodologies, all achieved with minimal time-space overhead. Our approach introduces a scalable and efficient paradigm for OCL, addressing both the challenge of CF and computational efficiency, thereby extending the applicability of OCL to more realistic and demanding scenarios.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"5 11","pages":"5746-5759"},"PeriodicalIF":0.0000,"publicationDate":"2024-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on artificial intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10539923/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

This work tackles the significant challenges inherent in online continual learning (OCL), a domain characterized by its handling of numerous tasks over extended periods. OCL is designed to adapt to evolving data distributions and previously unseen classes through a single-pass analysis of a data stream, mirroring the dynamic nature of real-world applications. Despite its promising potential, existing OCL methodologies often suffer from catastrophic forgetting (CF) when confronted with a large array of tasks, compounded by substantial computational demands that limit their practical utility. At the heart of our proposed solution is the adoption of a kernel density estimation (KDE) learning framework, aimed at resolving the task prediction (TP) dilemma and ensuring the separability of all tasks. This is achieved by equipping each task with a linear projection head and a probability density function (PDF), while a shared backbone is maintained across tasks to provide raw feature representations. During the inference phase, we leverage an ensemble of PDFs with a self-reporting mechanism based on maximum PDF values to identify the most appropriate model for classifying incoming instances. This strategy ensures that samples with identical labels are cohesively grouped within higher-density PDF regions, effectively segregating dissimilar instances across the feature spaces of different tasks. Extensive experimental validation across diverse OCL datasets underscores our framework's efficacy, showing significant gains over existing methodologies with minimal time and space overhead. Our approach introduces a scalable and efficient paradigm for OCL, addressing both CF and computational efficiency, thereby extending the applicability of OCL to more realistic and demanding scenarios.
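To make the inference mechanism concrete, below is a minimal sketch of the per-task projection-plus-KDE routing the abstract describes. Everything here is an illustrative assumption rather than the paper's implementation: the class name KDETaskRouter is invented, the projection heads are random rather than learned, scipy's gaussian_kde stands in for whatever density estimator the paper uses, and the per-task classifiers are omitted.

```python
# A minimal sketch, assuming random projection heads and scipy KDEs;
# the paper's actual heads are trained and its classifiers are omitted.
import numpy as np
from scipy.stats import gaussian_kde


class KDETaskRouter:
    """One linear projection head and one KDE (PDF) per task, over a
    shared backbone that supplies raw feature representations."""

    def __init__(self, backbone):
        self.backbone = backbone  # callable: (n, input_dim) -> (n, feat_dim)
        self.heads = {}           # task_id -> projection matrix (feat_dim, proj_dim)
        self.kdes = {}            # task_id -> fitted gaussian_kde on projected features

    def fit_task(self, task_id, X, proj_dim=8, seed=0):
        feats = self.backbone(X)                          # raw features, shared backbone
        rng = np.random.default_rng(seed)
        W = rng.standard_normal((feats.shape[1], proj_dim))
        W /= np.linalg.norm(W, axis=0, keepdims=True)     # placeholder linear head
        z = feats @ W                                     # this task's projected features
        self.heads[task_id] = W
        # KDE over projected features: same-label samples concentrate in
        # high-density regions of this task's PDF (requires n > proj_dim).
        self.kdes[task_id] = gaussian_kde(z.T)

    def predict_task(self, x):
        """'Self-reporting' inference: every task's PDF scores the incoming
        instance, and the task whose density is maximal claims it; that
        task's classifier (not shown) would then label the instance."""
        feat = self.backbone(x.reshape(1, -1))
        scores = {t: self.kdes[t]((feat @ W).T).item()    # density under task t's PDF
                  for t, W in self.heads.items()}
        return max(scores, key=scores.get)


# Illustrative usage with a trivial identity "backbone" and two synthetic tasks:
router = KDETaskRouter(backbone=lambda X: X)
rng = np.random.default_rng(1)
router.fit_task(0, rng.normal(0.0, 1.0, size=(200, 32)))
router.fit_task(1, rng.normal(5.0, 1.0, size=(200, 32)))
print(router.predict_task(rng.normal(5.0, 1.0, size=32)))  # likely prints 1
```

The design point this illustrates is that task identity never has to be stored or predicted by a separate classifier: each task's density model scores incoming instances, so routing cost grows only with the number of fitted PDFs, and adding a task touches nothing but its own head and KDE.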