Few-Shot Class-Incremental Learning with Non-IID Decentralized Data

Cuiwei Liu, Siang Xu, Huaijun Qiu, Jing Zhang, Zhi Liu, Liang Zhao
{"title":"Few-Shot Class-Incremental Learning with Non-IID Decentralized Data","authors":"Cuiwei Liu, Siang Xu, Huaijun Qiu, Jing Zhang, Zhi Liu, Liang Zhao","doi":"arxiv-2409.11657","DOIUrl":null,"url":null,"abstract":"Few-shot class-incremental learning is crucial for developing scalable and\nadaptive intelligent systems, as it enables models to acquire new classes with\nminimal annotated data while safeguarding the previously accumulated knowledge.\nNonetheless, existing methods deal with continuous data streams in a\ncentralized manner, limiting their applicability in scenarios that prioritize\ndata privacy and security. To this end, this paper introduces federated\nfew-shot class-incremental learning, a decentralized machine learning paradigm\ntailored to progressively learn new classes from scarce data distributed across\nmultiple clients. In this learning paradigm, clients locally update their\nmodels with new classes while preserving data privacy, and then transmit the\nmodel updates to a central server where they are aggregated globally. However,\nthis paradigm faces several issues, such as difficulties in few-shot learning,\ncatastrophic forgetting, and data heterogeneity. To address these challenges,\nwe present a synthetic data-driven framework that leverages replay buffer data\nto maintain existing knowledge and facilitate the acquisition of new knowledge.\nWithin this framework, a noise-aware generative replay module is developed to\nfine-tune local models with a balance of new and replay data, while generating\nsynthetic data of new classes to further expand the replay buffer for future\ntasks. Furthermore, a class-specific weighted aggregation strategy is designed\nto tackle data heterogeneity by adaptively aggregating class-specific\nparameters based on local models performance on synthetic data. This enables\neffective global model optimization without direct access to client data.\nComprehensive experiments across three widely-used datasets underscore the\neffectiveness and preeminence of the introduced framework.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Machine Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.11657","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Few-shot class-incremental learning is crucial for developing scalable and adaptive intelligent systems, as it enables models to acquire new classes with minimal annotated data while safeguarding previously accumulated knowledge. Nonetheless, existing methods deal with continuous data streams in a centralized manner, limiting their applicability in scenarios that prioritize data privacy and security. To this end, this paper introduces federated few-shot class-incremental learning, a decentralized machine learning paradigm tailored to progressively learn new classes from scarce data distributed across multiple clients. In this learning paradigm, clients locally update their models with new classes while preserving data privacy, and then transmit the model updates to a central server where they are aggregated globally. However, this paradigm faces several issues, such as difficulties in few-shot learning, catastrophic forgetting, and data heterogeneity. To address these challenges, we present a synthetic data-driven framework that leverages replay buffer data to maintain existing knowledge and facilitate the acquisition of new knowledge. Within this framework, a noise-aware generative replay module is developed to fine-tune local models with a balance of new and replay data, while generating synthetic data of new classes to further expand the replay buffer for future tasks. Furthermore, a class-specific weighted aggregation strategy is designed to tackle data heterogeneity by adaptively aggregating class-specific parameters based on local models' performance on synthetic data. This enables effective global model optimization without direct access to client data. Comprehensive experiments across three widely used datasets underscore the effectiveness and superiority of the introduced framework.
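
To make the class-specific weighted aggregation strategy concrete, the sketch below illustrates one plausible reading of the abstract: the server weights each client's class-specific parameters (e.g., classifier rows) by that client's per-class performance on shared synthetic data, then combines them per class. The paper does not publish code, so the function name, the accuracy-based weighting rule, and the tensor layout are all assumptions, not the authors' implementation.

```python
import numpy as np

def class_specific_weighted_aggregation(client_params, client_class_scores):
    """Aggregate per-class parameters across clients (illustrative sketch).

    client_params: list of arrays, each (num_classes, dim) -- the
        class-specific parameters (e.g., classifier weights) of one client.
    client_class_scores: list of arrays, each (num_classes,) -- e.g., that
        client's per-class accuracy measured on synthetic data at the server.
    Returns an array (num_classes, dim) for the global model.
    """
    params = np.stack(client_params)        # (num_clients, num_classes, dim)
    scores = np.stack(client_class_scores)  # (num_clients, num_classes)

    # Normalize per class so the weights over clients sum to 1 for each class;
    # the epsilon guards against classes on which every client scores zero.
    weights = scores / (scores.sum(axis=0, keepdims=True) + 1e-8)

    # Weighted sum over clients, computed independently for every class.
    return np.einsum("kc,kcd->cd", weights, params)
```

Under this reading, a client that classifies a given class well contributes more to that class's global parameters, while its influence on classes it handles poorly is suppressed, which is one way to counteract non-IID class distributions across clients.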