{"title":"LITE-SNN:利用固有动态性训练高能效尖峰神经网络以进行序列学习","authors":"Nitin Rathi;Kaushik Roy","doi":"10.1109/TCDS.2024.3396431","DOIUrl":null,"url":null,"abstract":"Spiking neural networks (SNNs) are gaining popularity for their promise of low-power machine intelligence on event-driven neuromorphic hardware. SNNs have achieved comparable performance as artificial neural networks (ANNs) on static tasks (image classification) with lower compute energy. In this work, we explore the inherent dynamics of SNNs for sequential tasks such as gesture recognition, sentiment analysis, and sequence-to-sequence learning on data from dynamic vision sensors (DVSs) and natural language processing (NLP). Sequential data are generally processed with complex recurrent neural networks (RNNs) [long short-term memory/gated recurrent unit (LSTM/GRU)] with explicit feedback connections and internal states to handle the long-term dependencies. The neuron models in SNNs—integrate-and-fire (IF) or leaky-integrate-and-fire (LIF)—have internal states (membrane potential) that can be efficiently leveraged for sequential tasks. The membrane potential in the IF/LIF neuron integrates the incoming current and outputs an event (or spike) when the potential crosses a threshold value. Since SNNs compute with highly sparse spike-based spatiotemporal data, the energy/inference is lower than LSTMs/GRUs. We also show that SNNs require fewer parameters than LSTM/GRU resulting in smaller models and faster inference. We observe the problem of vanishing gradients in vanilla SNNs for longer sequences and implement a convolutional SNN with attention layers to perform sequence-to-sequence learning tasks. The inherent recurrence in SNNs, in addition to the fully parallelized convolutional operations, provide additional mechanisms to model sequential dependencies that lead to better accuracy than convolutional neural networks (CNNs) with ReLU activations. We evaluate SNN on gesture recognition from the IBM DVS dataset, sentiment analysis from the IMDB movie reviews dataset, and German-to-English translation from the Multi30k dataset.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"16 6","pages":"1905-1914"},"PeriodicalIF":5.0000,"publicationDate":"2024-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"LITE-SNN: Leveraging Inherent Dynamics to Train Energy-Efficient Spiking Neural Networks for Sequential Learning\",\"authors\":\"Nitin Rathi;Kaushik Roy\",\"doi\":\"10.1109/TCDS.2024.3396431\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Spiking neural networks (SNNs) are gaining popularity for their promise of low-power machine intelligence on event-driven neuromorphic hardware. SNNs have achieved comparable performance as artificial neural networks (ANNs) on static tasks (image classification) with lower compute energy. In this work, we explore the inherent dynamics of SNNs for sequential tasks such as gesture recognition, sentiment analysis, and sequence-to-sequence learning on data from dynamic vision sensors (DVSs) and natural language processing (NLP). Sequential data are generally processed with complex recurrent neural networks (RNNs) [long short-term memory/gated recurrent unit (LSTM/GRU)] with explicit feedback connections and internal states to handle the long-term dependencies. 
The neuron models in SNNs—integrate-and-fire (IF) or leaky-integrate-and-fire (LIF)—have internal states (membrane potential) that can be efficiently leveraged for sequential tasks. The membrane potential in the IF/LIF neuron integrates the incoming current and outputs an event (or spike) when the potential crosses a threshold value. Since SNNs compute with highly sparse spike-based spatiotemporal data, the energy/inference is lower than LSTMs/GRUs. We also show that SNNs require fewer parameters than LSTM/GRU resulting in smaller models and faster inference. We observe the problem of vanishing gradients in vanilla SNNs for longer sequences and implement a convolutional SNN with attention layers to perform sequence-to-sequence learning tasks. The inherent recurrence in SNNs, in addition to the fully parallelized convolutional operations, provide additional mechanisms to model sequential dependencies that lead to better accuracy than convolutional neural networks (CNNs) with ReLU activations. We evaluate SNN on gesture recognition from the IBM DVS dataset, sentiment analysis from the IMDB movie reviews dataset, and German-to-English translation from the Multi30k dataset.\",\"PeriodicalId\":54300,\"journal\":{\"name\":\"IEEE Transactions on Cognitive and Developmental Systems\",\"volume\":\"16 6\",\"pages\":\"1905-1914\"},\"PeriodicalIF\":5.0000,\"publicationDate\":\"2024-03-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Cognitive and Developmental Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10518157/\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Cognitive and Developmental Systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10518157/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
LITE-SNN: Leveraging Inherent Dynamics to Train Energy-Efficient Spiking Neural Networks for Sequential Learning
Spiking neural networks (SNNs) are gaining popularity for their promise of low-power machine intelligence on event-driven neuromorphic hardware. SNNs have achieved performance comparable to that of artificial neural networks (ANNs) on static tasks (image classification) with lower compute energy. In this work, we explore the inherent dynamics of SNNs for sequential tasks such as gesture recognition, sentiment analysis, and sequence-to-sequence learning on data from dynamic vision sensors (DVSs) and natural language processing (NLP). Sequential data are generally processed with complex recurrent neural networks (RNNs) [long short-term memory/gated recurrent unit (LSTM/GRU)] with explicit feedback connections and internal states to handle long-term dependencies. The neuron models in SNNs, integrate-and-fire (IF) or leaky-integrate-and-fire (LIF), have internal states (membrane potential) that can be efficiently leveraged for sequential tasks. The membrane potential in an IF/LIF neuron integrates the incoming current and outputs an event (or spike) when the potential crosses a threshold value. Since SNNs compute with highly sparse spike-based spatiotemporal data, their energy per inference is lower than that of LSTMs/GRUs. We also show that SNNs require fewer parameters than LSTMs/GRUs, resulting in smaller models and faster inference. We observe the problem of vanishing gradients in vanilla SNNs for longer sequences and implement a convolutional SNN with attention layers to perform sequence-to-sequence learning tasks. The inherent recurrence in SNNs, in addition to the fully parallelized convolutional operations, provides an additional mechanism for modeling sequential dependencies, leading to better accuracy than convolutional neural networks (CNNs) with ReLU activations. We evaluate SNNs on gesture recognition with the IBM DVS dataset, sentiment analysis with the IMDB movie reviews dataset, and German-to-English translation with the Multi30k dataset.
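To make the IF/LIF dynamics described in the abstract concrete, below is a minimal sketch of a discrete-time LIF neuron in Python/NumPy. It is illustrative only: the leak factor beta, the threshold v_th, and the hard-reset behavior are common modeling choices assumed here, not the exact formulation used in the paper.

import numpy as np

def lif_step(v, input_current, beta=0.9, v_th=1.0):
    # One discrete-time update of a leaky-integrate-and-fire (LIF) neuron.
    # Setting beta = 1.0 recovers the non-leaky integrate-and-fire (IF) neuron.
    v = beta * v + input_current           # membrane potential integrates the incoming current
    spikes = (v >= v_th).astype(v.dtype)   # emit an event (spike) where the threshold is crossed
    v = v * (1.0 - spikes)                 # hard reset of neurons that fired (an assumed choice)
    return v, spikes

# Example: run a small population for T time steps; the random input is a
# stand-in for weighted presynaptic spikes.
rng = np.random.default_rng(0)
T, n = 20, 4
v = np.zeros(n)
for t in range(T):
    v, spikes = lif_step(v, rng.random(n) * 0.5)

Because the membrane potential v is carried across time steps, it plays the role of the hidden state in an RNN; this is the inherent recurrence the abstract leverages for sequential tasks, without the explicit feedback weights of an LSTM/GRU.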
Journal Introduction:
The IEEE Transactions on Cognitive and Developmental Systems (TCDS) focuses on advances in the study of development and cognition in natural (humans, animals) and artificial (robots, agents) systems. It welcomes contributions from multiple related disciplines including cognitive systems, cognitive robotics, developmental and epigenetic robotics, autonomous and evolutionary robotics, social structures, multi-agent and artificial life systems, computational neuroscience, and developmental psychology. Articles on theoretical, computational, application-oriented, and experimental studies as well as reviews in these areas are considered.