{"title":"利用非 IID 分散数据进行少镜头分类增量学习","authors":"Cuiwei Liu, Siang Xu, Huaijun Qiu, Jing Zhang, Zhi Liu, Liang Zhao","doi":"arxiv-2409.11657","DOIUrl":null,"url":null,"abstract":"Few-shot class-incremental learning is crucial for developing scalable and\nadaptive intelligent systems, as it enables models to acquire new classes with\nminimal annotated data while safeguarding the previously accumulated knowledge.\nNonetheless, existing methods deal with continuous data streams in a\ncentralized manner, limiting their applicability in scenarios that prioritize\ndata privacy and security. To this end, this paper introduces federated\nfew-shot class-incremental learning, a decentralized machine learning paradigm\ntailored to progressively learn new classes from scarce data distributed across\nmultiple clients. In this learning paradigm, clients locally update their\nmodels with new classes while preserving data privacy, and then transmit the\nmodel updates to a central server where they are aggregated globally. However,\nthis paradigm faces several issues, such as difficulties in few-shot learning,\ncatastrophic forgetting, and data heterogeneity. To address these challenges,\nwe present a synthetic data-driven framework that leverages replay buffer data\nto maintain existing knowledge and facilitate the acquisition of new knowledge.\nWithin this framework, a noise-aware generative replay module is developed to\nfine-tune local models with a balance of new and replay data, while generating\nsynthetic data of new classes to further expand the replay buffer for future\ntasks. Furthermore, a class-specific weighted aggregation strategy is designed\nto tackle data heterogeneity by adaptively aggregating class-specific\nparameters based on local models performance on synthetic data. This enables\neffective global model optimization without direct access to client data.\nComprehensive experiments across three widely-used datasets underscore the\neffectiveness and preeminence of the introduced framework.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Few-Shot Class-Incremental Learning with Non-IID Decentralized Data\",\"authors\":\"Cuiwei Liu, Siang Xu, Huaijun Qiu, Jing Zhang, Zhi Liu, Liang Zhao\",\"doi\":\"arxiv-2409.11657\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Few-shot class-incremental learning is crucial for developing scalable and\\nadaptive intelligent systems, as it enables models to acquire new classes with\\nminimal annotated data while safeguarding the previously accumulated knowledge.\\nNonetheless, existing methods deal with continuous data streams in a\\ncentralized manner, limiting their applicability in scenarios that prioritize\\ndata privacy and security. To this end, this paper introduces federated\\nfew-shot class-incremental learning, a decentralized machine learning paradigm\\ntailored to progressively learn new classes from scarce data distributed across\\nmultiple clients. In this learning paradigm, clients locally update their\\nmodels with new classes while preserving data privacy, and then transmit the\\nmodel updates to a central server where they are aggregated globally. However,\\nthis paradigm faces several issues, such as difficulties in few-shot learning,\\ncatastrophic forgetting, and data heterogeneity. 
To address these challenges,\\nwe present a synthetic data-driven framework that leverages replay buffer data\\nto maintain existing knowledge and facilitate the acquisition of new knowledge.\\nWithin this framework, a noise-aware generative replay module is developed to\\nfine-tune local models with a balance of new and replay data, while generating\\nsynthetic data of new classes to further expand the replay buffer for future\\ntasks. Furthermore, a class-specific weighted aggregation strategy is designed\\nto tackle data heterogeneity by adaptively aggregating class-specific\\nparameters based on local models performance on synthetic data. This enables\\neffective global model optimization without direct access to client data.\\nComprehensive experiments across three widely-used datasets underscore the\\neffectiveness and preeminence of the introduced framework.\",\"PeriodicalId\":501301,\"journal\":{\"name\":\"arXiv - CS - Machine Learning\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Machine Learning\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.11657\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Machine Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.11657","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Few-Shot Class-Incremental Learning with Non-IID Decentralized Data
Few-shot class-incremental learning is crucial for developing scalable and
adaptive intelligent systems, as it enables models to acquire new classes with
minimal annotated data while safeguarding previously accumulated knowledge.
Nonetheless, existing methods deal with continuous data streams in a
centralized manner, limiting their applicability in scenarios that prioritize
data privacy and security. To this end, this paper introduces federated
few-shot class-incremental learning, a decentralized machine learning paradigm
tailored to progressively learn new classes from scarce data distributed across
multiple clients. In this learning paradigm, clients locally update their
models with new classes while preserving data privacy, and then transmit the
model updates to a central server where they are aggregated globally. However,
this paradigm faces several issues, such as difficulties in few-shot learning,
catastrophic forgetting, and data heterogeneity. To address these challenges,
we present a synthetic data-driven framework that leverages replay buffer data
to maintain existing knowledge and facilitate the acquisition of new knowledge.
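As a rough illustration of this replay-driven local training, a minimal sketch under assumed PyTorch scaffolding is given below. It is not the authors' code: the function name, hyperparameters, and dataset handling are all illustrative, and the replay buffer is treated as an ordinary dataset.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, ConcatDataset

def local_update(model, new_data, replay_data, epochs=5, lr=1e-3):
    """Fine-tune a client's model on a mix of scarce new-class samples
    and replayed samples, then return the update sent to the server."""
    # Mixing the two sources in one loader balances new and old knowledge.
    loader = DataLoader(ConcatDataset([new_data, replay_data]),
                        batch_size=32, shuffle=True)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
    return model.state_dict()  # the model update transmitted to the server
```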
Within this framework, a noise-aware generative replay module is developed to
fine-tune local models with a balance of new and replay data, while generating
synthetic data of new classes to further expand the replay buffer for future
tasks. Furthermore, a class-specific weighted aggregation strategy is designed
to tackle data heterogeneity by adaptively aggregating class-specific
parameters based on local models' performance on synthetic data. This enables
effective global model optimization without direct access to client data.
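As a concrete but hypothetical reading of this strategy, the server could weight each client's class-specific classifier parameters by that client's per-class accuracy on the synthetic data. The sketch below assumes a linear classifier whose weight matrix has one row per class; none of these names come from the paper.

```python
import numpy as np

def aggregate_classifier(client_weights, client_acc):
    """client_weights: list of (num_classes, dim) classifier matrices,
    one per client; client_acc: (num_clients, num_classes) accuracies
    measured on server-side synthetic data. Returns the aggregated
    (num_classes, dim) classifier."""
    W = np.stack(client_weights)    # (num_clients, num_classes, dim)
    acc = np.asarray(client_acc)    # (num_clients, num_classes)
    # Normalize per class so the weights for each class sum to one.
    alpha = acc / (acc.sum(axis=0, keepdims=True) + 1e-8)
    # Per-class weighted average of the clients' classifier rows.
    return np.einsum('kc,kcd->cd', alpha, W)
```

Normalizing the accuracies per class means clients that model a class well dominate that class's aggregated parameters, which is one plausible way to counter non-IID skew without inspecting client data.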
Comprehensive experiments on three widely used datasets demonstrate the
effectiveness and superiority of the proposed framework.
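To show how these pieces could compose into the federated paradigm described above, here is a schematic communication round. It reuses the hypothetical local_update sketch; the client attributes, the 'fc.weight' key, and the synthetic-data evaluator eval_per_class_acc are assumptions for illustration, not details from the paper.

```python
import torch

def federated_round(global_model, clients, eval_per_class_acc):
    """One illustrative round: broadcast the global model, run the
    replay-balanced local update on each client, then aggregate.
    Classifier rows ('fc.weight', an assumed key) use the class-specific
    weighting above; all other parameters use plain averaging. Assumes
    every state tensor is floating point (no BatchNorm counters)."""
    global_state = global_model.state_dict()
    states, accs = [], []
    for c in clients:
        c.model.load_state_dict(global_state)        # broadcast
        states.append(local_update(c.model, c.new_data, c.replay_buffer))
        accs.append(eval_per_class_acc(c.model))     # acc on synthetic data
    # Non-classifier parameters: uniform averaging across clients.
    new_state = {k: torch.stack([s[k] for s in states]).mean(dim=0)
                 for k in global_state if k != 'fc.weight'}
    # Classifier rows: class-specific weighting by synthetic-data accuracy.
    W = torch.stack([s['fc.weight'] for s in states])  # (clients, classes, dim)
    a = torch.tensor(accs, dtype=W.dtype)              # (clients, classes)
    alpha = a / (a.sum(dim=0, keepdim=True) + 1e-8)
    new_state['fc.weight'] = torch.einsum('kc,kcd->cd', alpha, W)
    global_model.load_state_dict(new_state)
    return global_model
```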