{"title":"尖峰变压器的时空尖峰特征剪枝","authors":"Zhaokun Zhou;Kaiwei Che;Jun Niu;Man Yao;Guoqi Li;Li Yuan;Guibo Luo;Yuesheng Zhu","doi":"10.1109/TCDS.2024.3500018","DOIUrl":null,"url":null,"abstract":"Spiking neural networks (SNNs) are known for brain-inspired architecture and low power consumption. Leveraging biocompatibility and self-attention mechanism, Spiking Transformers become the most promising SNN architecture with high accuracy. However, Spiking Transformers still faces the challenge of high training costs, such as a 51<inline-formula><tex-math>$M$</tex-math></inline-formula> network requiring 181 training hours on ImageNet. In this work, we explore feature pruning to reduce training costs and overcome two challenges: high pruning ratio and lightweight pruning methods. We first analyze the spiking features and find the potential for a high pruning ratio. The majority of information is concentrated on a part of the spiking features in spiking transformer, which suggests that we can keep this part of the tokens and prune the others. To achieve lightweight, a parameter-free spatial–temporal spiking feature pruning method is proposed, which uses only a simple addition-sorting operation. The spiking features/tokens with high spike accumulation values are selected for training. The others are pruned and merged through a compensation module called Softmatch. Experimental results demonstrate that our method reduces training costs without compromising image classification accuracy. On ImageNet, our approach reduces the training time from 181 to 128 h while achieving comparable accuracy (83.13% versus 83.07%).","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"17 3","pages":"644-658"},"PeriodicalIF":4.9000,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Spatial–Temporal Spiking Feature Pruning in Spiking Transformer\",\"authors\":\"Zhaokun Zhou;Kaiwei Che;Jun Niu;Man Yao;Guoqi Li;Li Yuan;Guibo Luo;Yuesheng Zhu\",\"doi\":\"10.1109/TCDS.2024.3500018\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Spiking neural networks (SNNs) are known for brain-inspired architecture and low power consumption. Leveraging biocompatibility and self-attention mechanism, Spiking Transformers become the most promising SNN architecture with high accuracy. However, Spiking Transformers still faces the challenge of high training costs, such as a 51<inline-formula><tex-math>$M$</tex-math></inline-formula> network requiring 181 training hours on ImageNet. In this work, we explore feature pruning to reduce training costs and overcome two challenges: high pruning ratio and lightweight pruning methods. We first analyze the spiking features and find the potential for a high pruning ratio. The majority of information is concentrated on a part of the spiking features in spiking transformer, which suggests that we can keep this part of the tokens and prune the others. To achieve lightweight, a parameter-free spatial–temporal spiking feature pruning method is proposed, which uses only a simple addition-sorting operation. The spiking features/tokens with high spike accumulation values are selected for training. The others are pruned and merged through a compensation module called Softmatch. Experimental results demonstrate that our method reduces training costs without compromising image classification accuracy. 
On ImageNet, our approach reduces the training time from 181 to 128 h while achieving comparable accuracy (83.13% versus 83.07%).\",\"PeriodicalId\":54300,\"journal\":{\"name\":\"IEEE Transactions on Cognitive and Developmental Systems\",\"volume\":\"17 3\",\"pages\":\"644-658\"},\"PeriodicalIF\":4.9000,\"publicationDate\":\"2024-11-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Cognitive and Developmental Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10758407/\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Cognitive and Developmental Systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10758407/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Spatial–Temporal Spiking Feature Pruning in Spiking Transformer
Spiking neural networks (SNNs) are known for their brain-inspired architecture and low power consumption. Leveraging biological plausibility and the self-attention mechanism, Spiking Transformers have become the most promising high-accuracy SNN architecture. However, Spiking Transformers still face the challenge of high training costs: a 51M-parameter network requires 181 training hours on ImageNet. In this work, we explore feature pruning to reduce training costs, addressing two challenges: achieving a high pruning ratio and keeping the pruning method lightweight. We first analyze the spiking features and find the potential for a high pruning ratio: in the Spiking Transformer, the majority of the information is concentrated in a subset of the spiking features, which suggests that we can keep these tokens and prune the others. To keep the method lightweight, we propose a parameter-free spatial–temporal spiking feature pruning method that uses only a simple addition-and-sorting operation. The spiking features/tokens with high spike-accumulation values are selected for training; the others are pruned and merged through a compensation module called Softmatch. Experimental results demonstrate that our method reduces training costs without compromising image classification accuracy. On ImageNet, our approach reduces the training time from 181 to 128 h while achieving comparable accuracy (83.13% versus 83.07%).
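Below is a minimal PyTorch sketch of the addition-sorting token selection described in the abstract, assuming a [T, B, N, D] (timesteps, batch, tokens, channels) spike-tensor layout. The function name, the keep_ratio parameter, and the mean-merge stand-in for the paper's Softmatch compensation module are illustrative assumptions, not the authors' implementation.

```python
import torch

def spike_accumulation_prune(x: torch.Tensor, keep_ratio: float = 0.5) -> torch.Tensor:
    """Prune spiking tokens by their spike-accumulation score.

    x: float spike tensor (0/1 values) of shape [T, B, N, D]
       (timesteps, batch, tokens, channels) -- an assumed layout.
    Returns the kept tokens plus one merged compensation token.
    """
    T, B, N, D = x.shape

    # Parameter-free score: total spike count per token, summed over
    # time and channels (the "addition" step).
    scores = x.sum(dim=(0, 3))                       # [B, N]

    # The "sorting" step: keep the top-k tokens by accumulated spikes.
    k = max(1, int(N * keep_ratio))
    keep_idx = scores.topk(k, dim=1).indices         # [B, k]
    drop_mask = torch.ones(B, N, dtype=torch.bool, device=x.device)
    drop_mask.scatter_(1, keep_idx, False)           # True where pruned

    # Gather the kept tokens across all timesteps.
    gather_idx = keep_idx[None, :, :, None].expand(T, B, k, D)
    kept = x.gather(2, gather_idx)                   # [T, B, k, D]

    # Stand-in for the paper's Softmatch compensation: merge the pruned
    # tokens into a single averaged token (a simplification; the actual
    # module is described in the paper).
    dropped = x * drop_mask[None, :, :, None]        # zero out kept tokens
    merged = dropped.sum(dim=2, keepdim=True) / max(N - k, 1)  # [T, B, 1, D]

    return torch.cat([kept, merged], dim=2)          # [T, B, k+1, D]
```

Because the score is just a spike count, the selection introduces no learnable parameters: the whole pruning step is one summation and one top-k sort per batch, which is what makes the method lightweight relative to learned token-pruning schemes.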
Journal Introduction:
The IEEE Transactions on Cognitive and Developmental Systems (TCDS) focuses on advances in the study of development and cognition in natural (humans, animals) and artificial (robots, agents) systems. It welcomes contributions from multiple related disciplines including cognitive systems, cognitive robotics, developmental and epigenetic robotics, autonomous and evolutionary robotics, social structures, multi-agent and artificial life systems, computational neuroscience, and developmental psychology. Articles on theoretical, computational, application-oriented, and experimental studies as well as reviews in these areas are considered.