{"title":"具有多结构兼容性的高效尖峰卷积神经网络加速器。","authors":"Jiadong Wu, Lun Lu, Yinan Wang, Zhiwei Li, Changlin Chen, Qingjiang Li, Kairang Chen","doi":"10.3389/fnins.2025.1662886","DOIUrl":null,"url":null,"abstract":"<p><p>Spiking Neural Networks (SNNs) possess excellent computational energy efficiency and biological credibility. Among them, Spiking Convolutional Neural Networks (SCNNs) have significantly improved performance, demonstrating promising applications in low-power and brain-like computing. To achieve hardware acceleration for SCNNs, we propose an efficient FPGA accelerator architecture with multi-structure compatibility. This architecture supports both traditional convolutional and residual topologies, and can be adapted to diverse requirements from small networks to complex networks. This architecture uses a clock-driven scheme to perform convolution and neuron updates based on the spike-encoded image at each timestep. Through hierarchical pipelining and channel parallelization strategies, the computation speed of SCNNs is increased. To address the issue of current accelerators only supporting simple network, this architecture combines configuration and scheduling methods, including grouped reuse computation and line-by-line multi-timestep computation to accelerate deep networks with lots of channels and large feature map sizes. Based on the proposed accelerator architecture, we evaluated two scales of networks, named small-scale LeNet and deep residual SCNN, for object detection. Experiments show that the proposed accelerator achieves a maximum recognition speed of 1, 605 frames/s at a 100 MHz clock for the LeNet network, consuming only 0.65 mJ per image. Furthermore, the accelerator, combined with the proposed configuration and scheduling methods, achieves acceleration for each residual module in the deep residual SCNN, reaching a processing speed of 2.59 times that of the CPU with a power consumption of only 16.77% of the CPU. This demonstrates that the proposed accelerator architecture can achieve higher energy efficiency, compatibility, and wider applicability.</p>","PeriodicalId":12639,"journal":{"name":"Frontiers in Neuroscience","volume":"19 ","pages":"1662886"},"PeriodicalIF":3.2000,"publicationDate":"2025-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12511070/pdf/","citationCount":"0","resultStr":"{\"title\":\"Efficient spiking convolutional neural networks accelerator with multi-structure compatibility.\",\"authors\":\"Jiadong Wu, Lun Lu, Yinan Wang, Zhiwei Li, Changlin Chen, Qingjiang Li, Kairang Chen\",\"doi\":\"10.3389/fnins.2025.1662886\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Spiking Neural Networks (SNNs) possess excellent computational energy efficiency and biological credibility. Among them, Spiking Convolutional Neural Networks (SCNNs) have significantly improved performance, demonstrating promising applications in low-power and brain-like computing. To achieve hardware acceleration for SCNNs, we propose an efficient FPGA accelerator architecture with multi-structure compatibility. This architecture supports both traditional convolutional and residual topologies, and can be adapted to diverse requirements from small networks to complex networks. This architecture uses a clock-driven scheme to perform convolution and neuron updates based on the spike-encoded image at each timestep. Through hierarchical pipelining and channel parallelization strategies, the computation speed of SCNNs is increased. To address the issue of current accelerators only supporting simple network, this architecture combines configuration and scheduling methods, including grouped reuse computation and line-by-line multi-timestep computation to accelerate deep networks with lots of channels and large feature map sizes. Based on the proposed accelerator architecture, we evaluated two scales of networks, named small-scale LeNet and deep residual SCNN, for object detection. Experiments show that the proposed accelerator achieves a maximum recognition speed of 1, 605 frames/s at a 100 MHz clock for the LeNet network, consuming only 0.65 mJ per image. Furthermore, the accelerator, combined with the proposed configuration and scheduling methods, achieves acceleration for each residual module in the deep residual SCNN, reaching a processing speed of 2.59 times that of the CPU with a power consumption of only 16.77% of the CPU. This demonstrates that the proposed accelerator architecture can achieve higher energy efficiency, compatibility, and wider applicability.</p>\",\"PeriodicalId\":12639,\"journal\":{\"name\":\"Frontiers in Neuroscience\",\"volume\":\"19 \",\"pages\":\"1662886\"},\"PeriodicalIF\":3.2000,\"publicationDate\":\"2025-09-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12511070/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Frontiers in Neuroscience\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.3389/fnins.2025.1662886\",\"RegionNum\":3,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2025/1/1 0:00:00\",\"PubModel\":\"eCollection\",\"JCR\":\"Q2\",\"JCRName\":\"NEUROSCIENCES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Frontiers in Neuroscience","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.3389/fnins.2025.1662886","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/1/1 0:00:00","PubModel":"eCollection","JCR":"Q2","JCRName":"NEUROSCIENCES","Score":null,"Total":0}
Efficient spiking convolutional neural networks accelerator with multi-structure compatibility.
Spiking Neural Networks (SNNs) possess excellent computational energy efficiency and biological credibility. Among them, Spiking Convolutional Neural Networks (SCNNs) have significantly improved performance, demonstrating promising applications in low-power and brain-like computing. To achieve hardware acceleration for SCNNs, we propose an efficient FPGA accelerator architecture with multi-structure compatibility. This architecture supports both traditional convolutional and residual topologies, and can be adapted to diverse requirements from small networks to complex networks. This architecture uses a clock-driven scheme to perform convolution and neuron updates based on the spike-encoded image at each timestep. Through hierarchical pipelining and channel parallelization strategies, the computation speed of SCNNs is increased. To address the issue of current accelerators only supporting simple network, this architecture combines configuration and scheduling methods, including grouped reuse computation and line-by-line multi-timestep computation to accelerate deep networks with lots of channels and large feature map sizes. Based on the proposed accelerator architecture, we evaluated two scales of networks, named small-scale LeNet and deep residual SCNN, for object detection. Experiments show that the proposed accelerator achieves a maximum recognition speed of 1, 605 frames/s at a 100 MHz clock for the LeNet network, consuming only 0.65 mJ per image. Furthermore, the accelerator, combined with the proposed configuration and scheduling methods, achieves acceleration for each residual module in the deep residual SCNN, reaching a processing speed of 2.59 times that of the CPU with a power consumption of only 16.77% of the CPU. This demonstrates that the proposed accelerator architecture can achieve higher energy efficiency, compatibility, and wider applicability.
期刊介绍:
Neural Technology is devoted to the convergence between neurobiology and quantum-, nano- and micro-sciences. In our vision, this interdisciplinary approach should go beyond the technological development of sophisticated methods and should contribute in generating a genuine change in our discipline.