{"title":"Xpikeformer:用于脉冲变压器的混合模拟-数字硬件加速","authors":"Zihang Song;Prabodh Katti;Osvaldo Simeone;Bipin Rajendran","doi":"10.1109/TVLSI.2025.3552534","DOIUrl":null,"url":null,"abstract":"The integration of neuromorphic computing and transformers through spiking neural networks (SNNs) offers a promising path to energy-efficient sequence modeling, with the potential to overcome the energy-intensive nature of the artificial neural network (ANN)-based transformers. However, the algorithmic efficiency of SNN-based transformers cannot be fully exploited on GPUs due to architectural incompatibility. This article introduces Xpikeformer, a hybrid analog-digital hardware architecture designed to accelerate SNN-based transformer models. The architecture integrates analog in-memory computing (AIMC) for feedforward and fully connected layers, and a stochastic spiking attention (SSA) engine for efficient attention mechanisms. We detail the design, implementation, and evaluation of Xpikeformer, demonstrating significant improvements in energy consumption and computational efficiency. Through image classification tasks and wireless communication symbol detection tasks, we show that Xpikeformer can achieve inference accuracy comparable to the GPU implementation of ANN-based transformers. Evaluations reveal that Xpikeformer achieves a <inline-formula> <tex-math>$13\\times $ </tex-math></inline-formula> reduction in energy consumption at approximately the same throughput as the state-of-the-art (SOTA) digital accelerator for ANN-based transformers. In addition, Xpikeformer achieves up to <inline-formula> <tex-math>$1.9\\times $ </tex-math></inline-formula> energy reduction compared to the optimal digital ASIC projection of SOTA SNN-based transformers.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 6","pages":"1596-1609"},"PeriodicalIF":2.8000,"publicationDate":"2025-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Xpikeformer: Hybrid Analog-Digital Hardware Acceleration for Spiking Transformers\",\"authors\":\"Zihang Song;Prabodh Katti;Osvaldo Simeone;Bipin Rajendran\",\"doi\":\"10.1109/TVLSI.2025.3552534\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The integration of neuromorphic computing and transformers through spiking neural networks (SNNs) offers a promising path to energy-efficient sequence modeling, with the potential to overcome the energy-intensive nature of the artificial neural network (ANN)-based transformers. However, the algorithmic efficiency of SNN-based transformers cannot be fully exploited on GPUs due to architectural incompatibility. This article introduces Xpikeformer, a hybrid analog-digital hardware architecture designed to accelerate SNN-based transformer models. The architecture integrates analog in-memory computing (AIMC) for feedforward and fully connected layers, and a stochastic spiking attention (SSA) engine for efficient attention mechanisms. We detail the design, implementation, and evaluation of Xpikeformer, demonstrating significant improvements in energy consumption and computational efficiency. Through image classification tasks and wireless communication symbol detection tasks, we show that Xpikeformer can achieve inference accuracy comparable to the GPU implementation of ANN-based transformers. 
Evaluations reveal that Xpikeformer achieves a <inline-formula> <tex-math>$13\\\\times $ </tex-math></inline-formula> reduction in energy consumption at approximately the same throughput as the state-of-the-art (SOTA) digital accelerator for ANN-based transformers. In addition, Xpikeformer achieves up to <inline-formula> <tex-math>$1.9\\\\times $ </tex-math></inline-formula> energy reduction compared to the optimal digital ASIC projection of SOTA SNN-based transformers.\",\"PeriodicalId\":13425,\"journal\":{\"name\":\"IEEE Transactions on Very Large Scale Integration (VLSI) Systems\",\"volume\":\"33 6\",\"pages\":\"1596-1609\"},\"PeriodicalIF\":2.8000,\"publicationDate\":\"2025-04-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Very Large Scale Integration (VLSI) Systems\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10957839/\",\"RegionNum\":2,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10957839/","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Xpikeformer: Hybrid Analog-Digital Hardware Acceleration for Spiking Transformers
The integration of neuromorphic computing and transformers through spiking neural networks (SNNs) offers a promising path to energy-efficient sequence modeling, with the potential to overcome the energy-intensive nature of artificial neural network (ANN)-based transformers. However, the algorithmic efficiency of SNN-based transformers cannot be fully exploited on GPUs due to architectural incompatibility. This article introduces Xpikeformer, a hybrid analog-digital hardware architecture designed to accelerate SNN-based transformer models. The architecture integrates analog in-memory computing (AIMC) for the feedforward and fully connected layers, and a stochastic spiking attention (SSA) engine for an efficient attention mechanism. We detail the design, implementation, and evaluation of Xpikeformer, demonstrating significant improvements in energy consumption and computational efficiency. On image classification and wireless-communication symbol detection tasks, we show that Xpikeformer achieves inference accuracy comparable to GPU implementations of ANN-based transformers. Evaluations reveal that Xpikeformer achieves a 13× reduction in energy consumption at approximately the same throughput as the state-of-the-art (SOTA) digital accelerator for ANN-based transformers. In addition, Xpikeformer achieves up to a 1.9× energy reduction compared to the optimal digital ASIC projection of SOTA SNN-based transformers.
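To make the SSA idea concrete: the sketch below is a minimal NumPy illustration of attention computed in the spike domain, assuming Bernoulli rate coding of queries, keys, and values, per-time-step coincidence counting in place of full-precision dot products, and Bernoulli sampling of the attention weights in place of softmax. All names and design choices here (bernoulli_encode, stochastic_spiking_attention, the 1/d scaling, the time-averaged readout) are illustrative assumptions for intuition, not the paper's actual SSA engine.

```python
import numpy as np

rng = np.random.default_rng(0)


def bernoulli_encode(x, T):
    """Rate-code values in [0, 1] as length-T Bernoulli spike trains (0/1)."""
    return (rng.random((T,) + x.shape) < x).astype(np.int32)


def stochastic_spiking_attention(Q, K, V, T=64):
    """Illustrative spike-domain attention over T stochastic time steps.

    Q, K, V: (seq_len, d) arrays with entries scaled to [0, 1].
    Each step uses only 0/1 products and counts, the kind of
    AND/popcount logic a digital spiking-attention engine could use.
    """
    n, d = Q.shape
    q_s, k_s, v_s = (bernoulli_encode(m, T) for m in (Q, K, V))

    out = np.zeros((n, d))
    for t in range(T):
        # Spike-domain dot product: coincident-spike count, scaled to [0, 1].
        scores = (q_s[t] @ k_s[t].T) / d                      # (n, n)
        # Stochastically binarize the attention weights (Bernoulli sampling).
        attn = (rng.random(scores.shape) < scores).astype(np.int32)
        # Binary-weighted sum of value spikes.
        out += attn @ v_s[t]                                  # (n, d)
    # The time-averaged firing rate approximates the attention output.
    return out / T


# Toy usage: random rate-coded inputs, checking only the output shape.
n, d = 4, 16
Q, K, V = (rng.random((n, d)) for _ in range(3))
print(stochastic_spiking_attention(Q, K, V).shape)  # (4, 16)
```

The appeal of this style of computation is that each time step needs only binary multiplies and counts, which map directly to AND gates and popcount logic in hardware, and the variance of the stochastic estimate shrinks as the number of time steps T grows, trading latency for accuracy.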
Journal introduction:
The IEEE Transactions on VLSI Systems is published as a monthly journal under the co-sponsorship of the IEEE Circuits and Systems Society, the IEEE Computer Society, and the IEEE Solid-State Circuits Society.
Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chips and wafer fabrication, packaging, testing and systems applications. Generation of specifications, design and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor and process levels.
To address this critical area through a common forum, the IEEE Transactions on VLSI Systems were founded. The editorial board, consisting of international experts, invites original papers that emphasize the novel systems-integration aspects of microelectronic systems, including interactions among systems design and partitioning, logic and memory design, digital and analog circuit design, layout synthesis, CAD tools, chips and wafer fabrication, testing and packaging, and systems-level qualification. Thus, the coverage of these Transactions focuses on VLSI/ULSI microelectronic systems integration.