{"title":"Advancing Neuromorphic Architecture Toward Emerging Spiking Neural Network on FPGA","authors":"Yingxue Gao;Teng Wang;Yang Yang;Lei Gong;Xianglan Chen;Chao Wang;Xi Li;Xuehai Zhou","doi":"10.1109/TCAD.2025.3547275","DOIUrl":null,"url":null,"abstract":"Spiking neural networks (SNNs) replace the multiply-and-accumulate operations in traditional artificial neural networks (ANNs) with lightweight mask-and-accumulate operations, achieving greater performance. Existing SNN architectures are primarily designed based on fully-connected or convolutional SNN topologies and still struggle with low task accuracy, limiting their practical applications. Recently, transformer SNN (TSNN) models have shown promise in matching the accuracy of nonspiking ANNs and demonstrated potential application prospects. However, their diverse computation pattern and sophisticated network structure with high computation and memory footprints impede their efficient deployment. Thus, in this work, we move our attention to heterogeneous architecture design and propose SpikeTA, the first neuromorphic hardware accelerator explicitly designed for the TSNN model on FPGA. First, SpikeTA enables parameterizable hardware engines (HEs) designed for the network layers in TSNN, enhancing compatibility between HEs and network layers. Second, SpikeTA optimizes arithmetic operations between binary spikes and synaptic weights by presenting a DSP-efficient addition tree. By analyzing the inherent data characteristics, SpikeTA further introduces a depth-aware buffer management strategy to provide sufficient access ports. Third, SpikeTA employs a streaming dataflow mapping to optimize data transmission granularity and leverages a split-engine dataflow mapping to facilitate pipelined latency balancing. Experimental results demonstrate that SpikeTA achieves significant performance speedups of <inline-formula> <tex-math>$140.73\\times $ </tex-math></inline-formula>–<inline-formula> <tex-math>$1023.53\\times $ </tex-math></inline-formula> and <inline-formula> <tex-math>$2.97\\times $ </tex-math></inline-formula>–<inline-formula> <tex-math>$7.29\\times $ </tex-math></inline-formula> over architectures running on the AMD EPYC 7542 CPU and NVIDIA A100 GPU, respectively. SpikeTA also outperforms state-of-the-art SNN and Transformer accelerators by <inline-formula> <tex-math>$2.79\\times $ </tex-math></inline-formula> and <inline-formula> <tex-math>$2.66\\times $ </tex-math></inline-formula> in architecture performance while achieving a peak performance of 28.99 TOPs.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"44 9","pages":"3465-3478"},"PeriodicalIF":2.9000,"publicationDate":"2025-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10908632/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Citations: 0
Abstract
Spiking neural networks (SNNs) replace the multiply-and-accumulate operations in traditional artificial neural networks (ANNs) with lightweight mask-and-accumulate operations, achieving greater performance. Existing SNN architectures are primarily designed based on fully-connected or convolutional SNN topologies and still struggle with low task accuracy, limiting their practical applications. Recently, transformer SNN (TSNN) models have shown promise in matching the accuracy of nonspiking ANNs and demonstrated potential application prospects. However, their diverse computation patterns and sophisticated network structure with high computation and memory footprints impede their efficient deployment. Thus, in this work, we move our attention to heterogeneous architecture design and propose SpikeTA, the first neuromorphic hardware accelerator explicitly designed for the TSNN model on FPGA. First, SpikeTA enables parameterizable hardware engines (HEs) designed for the network layers in TSNN, enhancing compatibility between HEs and network layers. Second, SpikeTA optimizes arithmetic operations between binary spikes and synaptic weights by presenting a DSP-efficient addition tree. By analyzing the inherent data characteristics, SpikeTA further introduces a depth-aware buffer management strategy to provide sufficient access ports. Third, SpikeTA employs a streaming dataflow mapping to optimize data transmission granularity and leverages a split-engine dataflow mapping to facilitate pipelined latency balancing. Experimental results demonstrate that SpikeTA achieves significant performance speedups of $140.73\times$–$1023.53\times$ and $2.97\times$–$7.29\times$ over architectures running on the AMD EPYC 7542 CPU and NVIDIA A100 GPU, respectively. SpikeTA also outperforms state-of-the-art SNN and Transformer accelerators by $2.79\times$ and $2.66\times$ in architecture performance while achieving a peak performance of 28.99 TOPs.
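To illustrate the mask-and-accumulate idea the abstract refers to, the following minimal C++ sketch contrasts a conventional multiply-and-accumulate dot product with the spike-driven variant: because each spike is binary, the multiply degenerates into a conditional add, which is what allows SNN hardware to trade DSP multipliers for adders. This is not the paper's implementation; the function names, the fan-in size, and the example values are illustrative assumptions.

```cpp
#include <array>
#include <cstdint>
#include <iostream>

// Illustrative fan-in; not a parameter taken from the paper.
constexpr std::size_t kFanIn = 8;

// Conventional ANN neuron input: real-valued activations require a multiplier.
float mac_dot(const std::array<float, kFanIn>& act,
              const std::array<float, kFanIn>& weight) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < kFanIn; ++i) {
        acc += act[i] * weight[i];  // multiply-and-accumulate
    }
    return acc;
}

// SNN neuron input: binary spikes merely select which weights are summed.
float mask_accumulate(const std::array<std::uint8_t, kFanIn>& spike,
                      const std::array<float, kFanIn>& weight) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < kFanIn; ++i) {
        if (spike[i]) {        // spike is 0 or 1, so no multiplier is needed
            acc += weight[i];  // mask-and-accumulate
        }
    }
    return acc;
}

int main() {
    std::array<std::uint8_t, kFanIn> spikes = {1, 0, 1, 1, 0, 0, 1, 0};
    std::array<float, kFanIn> weights = {0.5f, -0.2f, 0.3f, 0.1f,
                                         0.7f, -0.4f, 0.2f, 0.6f};
    std::cout << "membrane input: " << mask_accumulate(spikes, weights) << "\n";
    return 0;
}
```

In hardware terms, the conditional add in the second loop is what an addition-tree datapath can implement with plain adders gated by spike bits, rather than with DSP multipliers, which is the kind of saving the abstract's DSP-efficient addition tree targets.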
Journal Introduction:
The purpose of this Transactions is to publish papers of interest to individuals in the area of computer-aided design of integrated circuits and systems composed of analog, digital, mixed-signal, optical, or microwave components. The aids include methods, models, algorithms, and man-machine interfaces for system-level, physical and logical design including: planning, synthesis, partitioning, modeling, simulation, layout, verification, testing, hardware-software co-design and documentation of integrated circuit and system designs of all complexities. Design tools and techniques for evaluating and designing integrated circuits and systems for metrics such as performance, power, reliability, testability, and security are a focus.