{"title":"培养终身学习的动态生长混合模型","authors":"Fei Ye;Adrian G. Bors","doi":"10.1109/TNNLS.2025.3569156","DOIUrl":null,"url":null,"abstract":"Lifelong learning (LLL) defines a training paradigm that aims to continuously acquire and capture new concepts from a sequence of tasks without forgetting. Recently, dynamic expansion models (DEMs) have been proposed to address catastrophic forgetting under the LLL paradigm. However, the efficiency of DEMs lacks a thorough explanation based on theoretical analysis. In this article, we develop a new theoretical framework that interprets the forgetting process of the DEM as increasing the statistical discrepancy distance between the distribution of the probabilistic representation of the new data and the previously learned knowledge. The theoretical analysis shows that adding new components to a mixture model represents a trade-off between model complexity and its performance. Inspired by the theoretical analysis, we introduce a new DEM, called the growing mixture model (GMM), where generative data components are added according to the novelty of the incoming task information compared to what is already known. A new component selection mechanism considering the model’s already acquired knowledge is employed for updating new DEM’s components, promoting efficient future task learning. We also train a compact student model with samples drawn through the generative mechanisms of the GMM, aiming to accumulate cross-domain representations over time. By employing the student model, we can significantly reduce the number of parameters and make quick inferences during the testing phase.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"36 9","pages":"15836-15850"},"PeriodicalIF":8.9000,"publicationDate":"2025-06-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Training a Dynamic Growing Mixture Model for Lifelong Learning\",\"authors\":\"Fei Ye;Adrian G. Bors\",\"doi\":\"10.1109/TNNLS.2025.3569156\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Lifelong learning (LLL) defines a training paradigm that aims to continuously acquire and capture new concepts from a sequence of tasks without forgetting. Recently, dynamic expansion models (DEMs) have been proposed to address catastrophic forgetting under the LLL paradigm. However, the efficiency of DEMs lacks a thorough explanation based on theoretical analysis. In this article, we develop a new theoretical framework that interprets the forgetting process of the DEM as increasing the statistical discrepancy distance between the distribution of the probabilistic representation of the new data and the previously learned knowledge. The theoretical analysis shows that adding new components to a mixture model represents a trade-off between model complexity and its performance. Inspired by the theoretical analysis, we introduce a new DEM, called the growing mixture model (GMM), where generative data components are added according to the novelty of the incoming task information compared to what is already known. A new component selection mechanism considering the model’s already acquired knowledge is employed for updating new DEM’s components, promoting efficient future task learning. We also train a compact student model with samples drawn through the generative mechanisms of the GMM, aiming to accumulate cross-domain representations over time. 
By employing the student model, we can significantly reduce the number of parameters and make quick inferences during the testing phase.\",\"PeriodicalId\":13303,\"journal\":{\"name\":\"IEEE transactions on neural networks and learning systems\",\"volume\":\"36 9\",\"pages\":\"15836-15850\"},\"PeriodicalIF\":8.9000,\"publicationDate\":\"2025-06-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on neural networks and learning systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/11027910/\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on neural networks and learning systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/11027910/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Lifelong learning (LLL) defines a training paradigm that aims to continuously acquire new concepts from a sequence of tasks without forgetting. Recently, dynamic expansion models (DEMs) have been proposed to address catastrophic forgetting under the LLL paradigm. However, the efficiency of DEMs still lacks a thorough theoretical explanation. In this article, we develop a new theoretical framework that interprets forgetting in a DEM as an increase in the statistical discrepancy distance between the distribution of the probabilistic representation of the new data and that of the previously learned knowledge. The theoretical analysis shows that adding new components to a mixture model represents a trade-off between model complexity and performance. Inspired by this analysis, we introduce a new DEM, called the growing mixture model (GMM), in which new generative components are added according to the novelty of the incoming task relative to the knowledge already acquired. A new component selection mechanism that takes the model's already acquired knowledge into account is employed to decide which of the DEM's components to update, promoting efficient learning of future tasks. We also train a compact student model on samples drawn through the generative mechanisms of the GMM, aiming to accumulate cross-domain representations over time. Employing the student model significantly reduces the number of parameters and enables fast inference during the testing phase.
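To make the expansion rule described in the abstract concrete, the following is a minimal, illustrative sketch of novelty-driven growth, not the authors' implementation: the classes GaussianComponent and GrowingMixture, the average negative log-likelihood used as a novelty score, and the novelty threshold are all hypothetical stand-ins for the paper's generative components, discrepancy-distance measure, and selection mechanism.

```python
# Minimal sketch of novelty-driven dynamic expansion (assumptions, not the paper's code):
# components are toy diagonal Gaussians standing in for generative models, and the
# novelty threshold is an arbitrary value chosen only for this synthetic example.
import numpy as np


class GaussianComponent:
    """Toy stand-in for one generative expert in the mixture."""

    def __init__(self, data):
        self.mean = data.mean(axis=0)
        self.var = data.var(axis=0) + 1e-6  # avoid division by zero

    def avg_nll(self, data):
        # Average negative log-likelihood of the data under this component,
        # used here as a crude proxy for the statistical discrepancy distance.
        diff = (data - self.mean) ** 2 / self.var
        return 0.5 * np.mean(np.sum(diff + np.log(2 * np.pi * self.var), axis=1))

    def sample(self, n, rng):
        return self.mean + np.sqrt(self.var) * rng.standard_normal((n, self.mean.size))


class GrowingMixture:
    """Adds a new component only when the incoming task looks sufficiently novel."""

    def __init__(self, novelty_threshold=20.0):
        self.components = []
        self.threshold = novelty_threshold  # assumed hyperparameter

    def learn_task(self, task_data):
        if self.components:
            scores = [c.avg_nll(task_data) for c in self.components]
            best = int(np.argmin(scores))
            if scores[best] < self.threshold:
                # Familiar task: update the selected component instead of growing.
                self.components[best] = GaussianComponent(task_data)
                return best
        # Novel task (or empty model): expand the mixture with a new component.
        self.components.append(GaussianComponent(task_data))
        return len(self.components) - 1

    def replay(self, n_per_component, rng):
        # Generative replay: draw samples from every component, e.g. to train a
        # compact student model that accumulates cross-domain representations.
        return np.vstack([c.sample(n_per_component, rng) for c in self.components])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three synthetic "tasks": the first two share a distribution, the third is novel.
    tasks = [rng.normal(loc=m, scale=1.0, size=(200, 8)) for m in (0.0, 0.0, 6.0)]
    model = GrowingMixture(novelty_threshold=20.0)
    for t in tasks:
        model.learn_task(t)
    print("components:", len(model.components))   # expected: 2 (second task reuses the first)
    student_batch = model.replay(64, rng)          # samples for distilling a student model
    print("replay batch shape:", student_batch.shape)
```

In the paper, the components are generative models and the novelty criterion follows from the discrepancy-distance analysis; the Gaussian stand-ins above only illustrate the control flow of growing versus selectively updating components, and of generative replay toward a compact student model.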
Journal introduction:
The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.