{"title":"SpikingSSMs: Learning Long Sequences with Sparse and Parallel Spiking State Space Models","authors":"Shuaijie Shen, Chao Wang, Renzhuo Huang, Yan Zhong, Qinghai Guo, Zhichao Lu, Jianguo Zhang, Luziwei Leng","doi":"arxiv-2408.14909","DOIUrl":null,"url":null,"abstract":"Known as low energy consumption networks, spiking neural networks (SNNs) have\ngained a lot of attention within the past decades. While SNNs are increasing\ncompetitive with artificial neural networks (ANNs) for vision tasks, they are\nrarely used for long sequence tasks, despite their intrinsic temporal dynamics.\nIn this work, we develop spiking state space models (SpikingSSMs) for long\nsequence learning by leveraging on the sequence learning abilities of state\nspace models (SSMs). Inspired by dendritic neuron structure, we hierarchically\nintegrate neuronal dynamics with the original SSM block, meanwhile realizing\nsparse synaptic computation. Furthermore, to solve the conflict of event-driven\nneuronal dynamics with parallel computing, we propose a light-weight surrogate\ndynamic network which accurately predicts the after-reset membrane potential\nand compatible to learnable thresholds, enabling orders of acceleration in\ntraining speed compared with conventional iterative methods. On the long range\narena benchmark task, SpikingSSM achieves competitive performance to\nstate-of-the-art SSMs meanwhile realizing on average 90\\% of network sparsity.\nOn language modeling, our network significantly surpasses existing spiking\nlarge language models (spikingLLMs) on the WikiText-103 dataset with only a\nthird of the model size, demonstrating its potential as backbone architecture\nfor low computation cost LLMs.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"60 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Neural and Evolutionary Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.14909","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Known for their low energy consumption, spiking neural networks (SNNs) have attracted considerable attention over the past decades. While SNNs are increasingly competitive with artificial neural networks (ANNs) on vision tasks, they are rarely used for long sequence tasks, despite their intrinsic temporal dynamics. In this work, we develop spiking state space models (SpikingSSMs) for long sequence learning by leveraging the sequence learning abilities of state space models (SSMs). Inspired by dendritic neuron structure, we hierarchically integrate neuronal dynamics with the original SSM block while realizing sparse synaptic computation. Furthermore, to resolve the conflict between event-driven neuronal dynamics and parallel computing, we propose a lightweight surrogate dynamic network that accurately predicts the after-reset membrane potential and is compatible with learnable thresholds, enabling orders-of-magnitude acceleration in training speed compared with conventional iterative methods. On the Long Range Arena benchmark, SpikingSSM achieves performance competitive with state-of-the-art SSMs while realizing on average 90% network sparsity. On language modeling, our network significantly surpasses existing spiking large language models (spikingLLMs) on the WikiText-103 dataset with only a third of their model size, demonstrating its potential as a backbone architecture for low-computation-cost LLMs.
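
To make the mechanism described above concrete, the following is a minimal, hypothetical sketch rather than the authors' implementation: an SSM block is assumed to produce dense features for all time steps in parallel; a conventional leaky integrate-and-fire (LIF) layer must then iterate over time because each spike resets the membrane potential, whereas a small stand-in module (here called SurrogateDynamicNet) predicts spikes for all steps at once, playing the role the abstract assigns to the surrogate dynamic network. All class names, layer sizes, the decay constant, and the threshold value are assumptions for illustration.

# Hypothetical sketch of the SpikingSSM idea from the abstract; not the paper's code.
import torch
import torch.nn as nn

class IterativeLIF(nn.Module):
    """Conventional event-driven LIF dynamics: sequential over time because
    the after-reset membrane potential depends on whether a spike was fired."""
    def __init__(self, tau: float = 2.0, threshold: float = 1.0):
        super().__init__()
        self.decay = 1.0 - 1.0 / tau     # assumed leak factor
        self.threshold = threshold

    def forward(self, x):                # x: (batch, time, dim) dense SSM features
        v = torch.zeros_like(x[:, 0])
        spikes = []
        for t in range(x.size(1)):
            v = self.decay * v + x[:, t]           # leaky integration
            s = (v >= self.threshold).float()      # fire
            v = v - s * self.threshold             # soft reset after firing
            spikes.append(s)
        return torch.stack(spikes, dim=1)          # sparse binary output

class SurrogateDynamicNet(nn.Module):
    """Hypothetical lightweight network approximating the after-reset membrane
    potential, so spikes can be thresholded for all time steps in parallel.
    (In practice a surrogate gradient would be used for the Heaviside step.)"""
    def __init__(self, dim: int, threshold: float = 1.0):
        super().__init__()
        self.threshold = nn.Parameter(torch.tensor(threshold))   # learnable threshold
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, x):                # x: (batch, time, dim), all steps at once
        v_hat = self.net(x)                          # predicted membrane potential
        return (v_hat >= self.threshold).float()     # parallel thresholding

# Usage: dense SSM features (e.g. from an S4/Mamba-style block) -> sparse spikes.
batch, time, dim = 2, 128, 64
ssm_features = torch.randn(batch, time, dim)          # placeholder for SSM output
spikes_seq = IterativeLIF()(ssm_features)             # slow, sequential reference
spikes_par = SurrogateDynamicNet(dim)(ssm_features)   # fast, parallel surrogate
print(spikes_seq.mean().item(), spikes_par.mean().item())   # spike rates (sparsity)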