Continual learning benefits from multiple sleep stages: NREM, REM, and Synaptic Downscaling
Brian S. Robinson, Clare W. Lau, Alexander New, S. Nichols, Erik C. Johnson, M. Wolmetz, W. Coon
2022 International Joint Conference on Neural Networks (IJCNN), published 2022-07-18
DOI: https://doi.org/10.1109/IJCNN55064.2022.9891965
Abstract
Learning new tasks and skills in succession without overwriting or interfering with prior learning (i.e., without “catastrophic forgetting”) is a computational challenge for both artificial and biological neural networks, yet artificial systems struggle to achieve even rudimentary parity with the performance and functionality apparent in biology. One biological process that can be adapted for artificial systems is sleep, during which the brain deploys numerous neural operations relevant to continual learning. Here, we investigate how modeling three distinct components of mammalian sleep together affects continual learning in artificial neural networks: (1) a veridical memory replay process observed during non-rapid eye movement (NREM) sleep; (2) a generative memory replay process linked to rapid eye movement (REM) sleep; and (3) a synaptic downscaling process that has been proposed to tune signal-to-noise ratios and support neural upkeep. To create this tripartite artificial sleep, we modeled NREM veridical replay by training the network on intermediate representations of samples from the current task. We modeled REM by using a generator network to create intermediate representations of samples from previous tasks for training. Synaptic downscaling, a novel contribution, is modeled as a size-dependent downscaling of network weights. We find benefits from including all three sleep components when evaluating performance on a continual learning CIFAR-100 image classification benchmark: maximum accuracy improved during training and catastrophic forgetting was reduced during later tasks. While some catastrophic forgetting persisted over the course of network training, higher levels of synaptic downscaling led to better retention of early tasks and further facilitated the recovery of early-task accuracy during subsequent training. One key takeaway is a trade-off in the level of synaptic downscaling: more aggressive downscaling better protects early tasks, but less downscaling enhances the ability to learn new tasks. Intermediate levels can strike a balance, yielding the highest overall accuracies during training. Overall, our results both provide insight into how to adapt sleep components to enhance artificial continual learning systems and highlight areas for future neuroscientific sleep research to further such systems.
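
The abstract describes synaptic downscaling only as a "size-dependent downscaling of network weights" and does not give the rule itself. The sketch below shows one way such an operation could look, purely as an illustration. The function name `downscale_weights`, the parameters `rate` and `scale`, the exponential form, and the choice to shrink small weights proportionally more than large ones are all assumptions made here, not details taken from the paper.

```python
import math


def downscale_weights(weights, rate=0.1, scale=1.0):
    """Shrink each weight toward zero by a magnitude-dependent factor.

    Hypothetical rule (not from the paper): a weight near zero loses up to
    a fraction `rate` of its value, and that fraction decays exponentially
    as |w| grows relative to `scale`.
    """
    downscaled = []
    for w in weights:
        # Fraction removed from this weight; lies in (0, rate] for rate > 0.
        shrink = rate * math.exp(-abs(w) / scale)
        downscaled.append(w * (1.0 - shrink))
    return downscaled


if __name__ == "__main__":
    weights = [0.01, -0.2, 1.5, -3.0]
    # Compare no, mild, and aggressive downscaling on the same weights.
    for level in (0.0, 0.1, 0.5):
        result = downscale_weights(weights, rate=level)
        print(level, [round(w, 4) for w in result])
```

In this sketch, `rate` plays the role of the downscaling level discussed above: setting it to 0 leaves the weights unchanged, while larger values shrink them more aggressively. How such a level would trade off early-task retention against new-task learning is an empirical question that the sketch does not answer.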