Marmotini: A Weight Density Adaptation Architecture With Hybrid Compression Method for Spiking Neural Network

IF 2.8 · Zone 2, Engineering & Technology · Q2, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IEEE Transactions on Very Large Scale Integration (VLSI) Systems · Pub Date: 2024-09-18 · DOI: 10.1109/TVLSI.2024.3453897

Zilin Wang; Yi Zhong; Zehong Ou; Youming Yang; Shuo Feng; Guang Chen; Xiaoxin Cui; Song Jia; Yuan Wang

Volume 32, Issue 12, Pages 2293-2302 · IEEE Xplore: https://ieeexplore.ieee.org/document/10682801/

Citations: 0

Abstract




Brain-inspired spiking neural networks (SNNs) have recently attracted widespread interest owing to their event-driven nature and the relatively low-power hardware needed to transmit highly sparse binary spikes. To further improve energy efficiency, matrix compression algorithms are used for weight storage. However, weight sparsity varies greatly from layer to layer, so in a multicore neuromorphic system it is difficult for a single compression algorithm to suit all layers of an SNN model. In this work, we propose a weight density adaptation architecture with a hybrid compression method for SNNs, named Marmotini. It is a heterogeneous multicore design with three types of cores, each handling computation at a different level of weight sparsity. Benefiting from the hybrid compression method, Marmotini minimizes the waste of neurons and weights as much as possible. In addition, for better flexibility, a reconfigurable core is proposed that can be configured to compute either a convolutional layer or a fully connected layer. Implemented on a Xilinx Kintex UltraScale XCKU115 field-programmable gate array (FPGA) board, Marmotini operates at 150 MHz, achieving 244.6-GSOP/s peak performance and 54.1-GSOP/W energy efficiency at 0% spike sparsity.
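To illustrate why a single compression scheme may not suit every layer, the following minimal sketch compares the storage cost of three generic weight formats as the number of nonzero weights changes. The three formats (dense, bitmap, and index list), the 8-bit weight width, and the function names are assumptions chosen for illustration only; the abstract does not state which formats Marmotini's three core types actually use.

```python
import math


def storage_bits(num_weights: int, nonzeros: int, weight_bits: int = 8) -> dict[str, int]:
    """Return the storage cost in bits of three hypothetical weight formats.

    dense:      every weight is stored, zero or not.
    bitmap:     one presence bit per position plus the nonzero values.
    index_list: each nonzero value is stored together with its position.
    """
    # Bits needed to address any position in the weight array (assumed encoding).
    index_bits = max(1, math.ceil(math.log2(num_weights)))
    return {
        "dense": num_weights * weight_bits,
        "bitmap": num_weights + nonzeros * weight_bits,
        "index_list": nonzeros * (weight_bits + index_bits),
    }


def cheapest_format(num_weights: int, nonzeros: int, weight_bits: int = 8) -> str:
    """Pick the format with the smallest storage cost for this layer."""
    costs = storage_bits(num_weights, nonzeros, weight_bits)
    return min(costs, key=costs.get)


if __name__ == "__main__":
    # Three hypothetical layers of 1024 weights with different densities.
    for nonzeros in (1024, 512, 51):
        print(nonzeros, cheapest_format(1024, nonzeros), storage_bits(1024, nonzeros))
```

Under these assumptions, a fully dense layer is cheapest to store uncompressed, a half-dense layer favors the bitmap, and a layer with about 5% nonzero weights favors the index list, which is the kind of per-layer variation that motivates a hybrid approach.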


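Source journal

IEEE Transactions on Very Large Scale Integration (VLSI) Systems (Engineering & Technology - Engineering: Electrical & Electronic)

CiteScore: 6.40

Self-citation rate: 7.10%

Articles published: 187

Review time: 3.6 months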

About the journal: The IEEE Transactions on VLSI Systems is published as a monthly journal under the co-sponsorship of the IEEE Circuits and Systems Society, the IEEE Computer Society, and the IEEE Solid-State Circuits Society. Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chips and wafer fabrication, packaging, testing and systems applications. Generation of specifications, design and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor and process levels. To address this critical area through a common forum, the IEEE Transactions on VLSI Systems have been founded. The editorial board, consisting of international experts, invites original papers which emphasize and merit the novel systems integration aspects of microelectronic systems including interactions among systems design and partitioning, logic and memory design, digital and analog circuit design, layout synthesis, CAD tools, chips and wafer fabrication, testing and packaging, and systems level qualification. Thus, the coverage of these Transactions will focus on VLSI/ULSI microelectronic systems integration.