{"title":"IM-LIF: Improved Neuronal Dynamics With Attention Mechanism for Direct Training Deep Spiking Neural Network","authors":"Shuang Lian;Jiangrong Shen;Ziming Wang;Huajin Tang","doi":"10.1109/TETCI.2024.3359539","DOIUrl":null,"url":null,"abstract":"Spiking neural networks (SNNs) are increasingly applied to deep architectures. Recent works are developed to apply spatio-temporal backpropagation to directly train deep SNNs. But the binary and non-differentiable properties of spike activities force directly trained SNNs to suffer from serious gradient vanishing. In this paper, we first analyze the cause of the gradient vanishing problem and identify that the gradients mostly backpropagate along the synaptic currents. Based on that, we modify the synaptic current equation of leaky-integrate-fire neuron model and propose the improved LIF (IM-LIF) neuron model on the basis of the temporal-wise attention mechanism. We utilize the temporal-wise attention mechanism to selectively establish the connection between the current and historical response values, which can empirically enable the neuronal states to update resilient to the gradient vanishing problem. Furthermore, to capture the neuronal dynamics embedded in the output incorporating the IM-LIF model, we present a new temporal loss function to constrain the output of the network close to the target distribution. The proposed new temporal loss function could not only act as a regularizer to eliminate output outliers, but also assign the network loss credit to the voltage at a specific time point. Then we modify the ResNet and VGG architecture based on the IM-LIF model to build deep SNNs. We evaluate our work on image datasets and neuromorphic datasets. Experimental results and analysis show that our method can help build deep SNNs with competitive performance in both accuracy and latency, including 95.66% on CIFAR-10, 77.42% on CIFAR-100, 55.37% on Tiny-ImageNet, 97.33% on DVS-Gesture, and 80.50% on CIFAR-DVS with very few timesteps.","PeriodicalId":13135,"journal":{"name":"IEEE Transactions on Emerging Topics in Computational Intelligence","volume":null,"pages":null},"PeriodicalIF":5.3000,"publicationDate":"2024-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Emerging Topics in Computational Intelligence","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10433858/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
Spiking neural networks (SNNs) are increasingly applied to deep architectures. Recent works have applied spatio-temporal backpropagation to directly train deep SNNs, but the binary and non-differentiable nature of spike activities causes directly trained SNNs to suffer from severe gradient vanishing. In this paper, we first analyze the cause of the gradient vanishing problem and identify that gradients mostly backpropagate along the synaptic currents. Based on this finding, we modify the synaptic current equation of the leaky integrate-and-fire (LIF) neuron model and propose the improved LIF (IM-LIF) neuron model, built on a temporal-wise attention mechanism. The temporal-wise attention mechanism selectively establishes connections between current and historical response values, which empirically allows the neuronal states to update in a way that is resilient to gradient vanishing. Furthermore, to capture the neuronal dynamics embedded in the output of networks incorporating the IM-LIF model, we present a new temporal loss function that constrains the network output to stay close to the target distribution. The proposed temporal loss function not only acts as a regularizer that suppresses output outliers but also assigns loss credit to the membrane voltage at specific time points. We then modify ResNet and VGG architectures based on the IM-LIF model to build deep SNNs. We evaluate our method on image and neuromorphic datasets. Experimental results and analysis show that our method helps build deep SNNs with competitive accuracy and latency, reaching 95.66% on CIFAR-10, 77.42% on CIFAR-100, 55.37% on Tiny-ImageNet, 97.33% on DVS-Gesture, and 80.50% on CIFAR-DVS with very few timesteps.
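To make the two ideas in the abstract concrete, the sketch below shows a minimal PyTorch interpretation of (a) an LIF layer whose synaptic current is gated by a learned per-timestep attention score over the historical current, and (b) a per-timestep loss that assigns credit to the output voltage at each time point. This is written from the abstract alone: the paper's exact update equations, attention form, and loss are not given here, so every name and hyperparameter (IMLIFNeuron, TemporalAttention-style gate, temporal_loss, beta, v_th, lam) is an illustrative assumption, not the authors' implementation.

```python
# Hypothetical sketch of an IM-LIF-style neuron and a per-timestep temporal
# loss, inferred from the abstract; the actual equations in the paper may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SurrogateSpike(torch.autograd.Function):
    """Heaviside spike with a rectangular surrogate gradient."""

    @staticmethod
    def forward(ctx, v):
        ctx.save_for_backward(v)
        return (v > 0).float()

    @staticmethod
    def backward(ctx, grad_out):
        (v,) = ctx.saved_tensors
        # Pass gradients only in a window of width 1.0 around the threshold.
        return grad_out * (v.abs() < 0.5).float()


class IMLIFNeuron(nn.Module):
    """Toy IM-LIF-style layer: a learned temporal attention score gates how
    much historical synaptic current carries into the current timestep."""

    def __init__(self, features, beta=0.5, v_th=1.0):
        super().__init__()
        self.beta, self.v_th = beta, v_th
        # Assumed form of temporal-wise attention: a per-step score in (0, 1)
        # computed from the current input.
        self.attn = nn.Sequential(nn.Linear(features, features), nn.Sigmoid())

    def forward(self, x_seq):
        # x_seq: (T, batch, features) input currents from the previous layer.
        i = torch.zeros_like(x_seq[0])
        v = torch.zeros_like(x_seq[0])
        spikes = []
        for x in x_seq:
            a = self.attn(x)      # temporal attention score for this step
            i = a * i + x         # attention-gated link to the historical current
            v = self.beta * v + i # leaky membrane integration
            s = SurrogateSpike.apply(v - self.v_th)
            v = v * (1.0 - s)     # hard reset after a spike
            spikes.append(s)
        return torch.stack(spikes)


def temporal_loss(v_out_seq, target, lam=0.1):
    """Assumed per-timestep loss: cross-entropy on each step's output voltage
    (assigning loss credit to specific time points) plus an L2 regularizer
    pulling the mean voltage toward the one-hot target distribution."""
    ce = sum(F.cross_entropy(v, target) for v in v_out_seq) / len(v_out_seq)
    one_hot = F.one_hot(target, v_out_seq.shape[-1]).float()
    mse = F.mse_loss(v_out_seq.mean(0), one_hot)
    return ce + lam * mse
```

Under these assumptions, replacing the constant current-decay factor of a standard LIF neuron with the learned gate `a` is what lets gradients flowing along the synaptic-current path be rescaled step by step rather than uniformly attenuated, which is the abstract's stated remedy for gradient vanishing.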
Journal Introduction:
The IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI) publishes original articles on emerging aspects of computational intelligence, including theory, applications, and surveys.
TETCI is an electronic-only publication and publishes six issues per year.
Authors are encouraged to submit manuscripts on any emerging topic in computational intelligence, especially nature-inspired computing topics not covered by other IEEE Computational Intelligence Society journals. Illustrative examples include glial cell networks, computational neuroscience, brain-computer interfaces, ambient intelligence, non-fuzzy computing with words, artificial life, cultural learning, artificial endocrine networks, social reasoning, artificial hormone networks, and computational intelligence for IoT and Smart-X technologies.