NEXUS: A 28nm 3.3pJ/SOP 16-Core Spiking Neural Network with a Diamond Topology for Real-Time Data Processing.

Maryam Sadeghi, Yasser Rezaeiyan, Dario Fernandez Khatiboun, Sherif Eissa, Federico Corradi, Charles Augustine, Farshad Moradi
{"title":"NEXUS: A 28nm 3.3pJ/SOP 16-Core Spiking Neural Network with a Diamond Topology for Real-Time Data Processing.","authors":"Maryam Sadeghi, Yasser Rezaeiyan, Dario Fernandez Khatiboun, Sherif Eissa, Federico Corradi, Charles Augustine, Farshad Moradi","doi":"10.1109/TBCAS.2024.3452635","DOIUrl":null,"url":null,"abstract":"<p><p>The realization of brain-scale spiking neural networks (SNNs) is impeded by power constraints and low integration density. To address these challenges, multi-core SNNs are utilized to emulate numerous neurons with high energy efficiency, where spike packets are routed through a network-on-chip (NoC). However, the information can be lost in the NoC under high spike traffic conditions, leading to performance degradation. This work presents NEXUS, a 16-core SNN with a diamond-shaped NoC topology fabricated in 28-nm CMOS technology. It integrates 4096 leaky integrate-and-fire (LIF) neurons with 1M 4-bit synaptic weights, occupying an area of 2.16 mm2. The proposed NoC architecture is scalable to any network size, ensuring no data loss due to contending packets with a maximum routing latency of 5.1μs for 16 cores. The proposed congestion management method eliminates the need for FIFO in routers, resulting in a compact router footprint of 0.001 mm<sup>2</sup>. The proposed neurosynaptic core allows for increasing the processing speed by up to 8.5× depending on input sparsity. The SNN achieves a peak throughput of 4.7 GSOP/s at 0.9 V, consuming a minimum energy per synaptic operation (SOP) of 3.3 pJ at 0.55 V. A 4-layer feed-forward network is mapped onto the chip, classifying MNIST digits with 92.3% accuracy at 8.4Kclassification/ s and consuming 2.7-μJ/classification. Additionally, an audio recognition task mapped onto the chip achieves 87.4% accuracy at 215-μJ/classification.</p>","PeriodicalId":94031,"journal":{"name":"IEEE transactions on biomedical circuits and systems","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on biomedical circuits and systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TBCAS.2024.3452635","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

The realization of brain-scale spiking neural networks (SNNs) is impeded by power constraints and low integration density. To address these challenges, multi-core SNNs are utilized to emulate numerous neurons with high energy efficiency, where spike packets are routed through a network-on-chip (NoC). However, information can be lost in the NoC under high spike traffic conditions, leading to performance degradation. This work presents NEXUS, a 16-core SNN with a diamond-shaped NoC topology fabricated in 28-nm CMOS technology. It integrates 4096 leaky integrate-and-fire (LIF) neurons with 1M 4-bit synaptic weights, occupying an area of 2.16 mm². The proposed NoC architecture is scalable to any network size, ensuring no data loss from contending packets, with a maximum routing latency of 5.1 μs for 16 cores. The proposed congestion management method eliminates the need for FIFOs in routers, resulting in a compact router footprint of 0.001 mm². The proposed neurosynaptic core increases processing speed by up to 8.5×, depending on input sparsity. The SNN achieves a peak throughput of 4.7 GSOP/s at 0.9 V, consuming a minimum energy per synaptic operation (SOP) of 3.3 pJ at 0.55 V. A 4-layer feed-forward network is mapped onto the chip, classifying MNIST digits with 92.3% accuracy at 8.4 k classifications/s while consuming 2.7 μJ/classification. Additionally, an audio recognition task mapped onto the chip achieves 87.4% accuracy at 215 μJ/classification.
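The abstract specifies leaky integrate-and-fire (LIF) neurons with 4-bit synaptic weights but does not give the exact digital update rule implemented on chip. As a behavioral reference only, the sketch below shows a minimal discrete-time LIF update with integer weights; the leak factor, threshold, and reset value are illustrative assumptions, not the fabricated design.

```python
import numpy as np

def lif_step(v, in_spikes, weights, leak=0.9, v_th=64, v_reset=0):
    """One discrete-time LIF update for a population of neurons.

    v          : membrane potentials, shape (n_neurons,)
    in_spikes  : binary input spike vector, shape (n_inputs,)
    weights    : integer synaptic weights (e.g. 4-bit, range -8..7),
                 shape (n_neurons, n_inputs)
    Returns the updated potentials and the output spike vector.
    Note: leak, v_th, and v_reset are illustrative values, not the
    parameters of the NEXUS chip.
    """
    # Integrate weighted input spikes (one synaptic operation per
    # active synapse), then apply a multiplicative leak.
    v = leak * v + weights @ in_spikes
    # Fire where the potential crosses the threshold, then reset.
    out_spikes = (v >= v_th).astype(np.uint8)
    v = np.where(out_spikes == 1, v_reset, v)
    return v, out_spikes

# Example: 4096 neurons, 256 inputs, random 4-bit weights.
rng = np.random.default_rng(0)
w = rng.integers(-8, 8, size=(4096, 256))
v = np.zeros(4096)
spikes_in = rng.integers(0, 2, size=256)
v, spikes_out = lif_step(v, spikes_in, w)
```

For scale, simple arithmetic on the quoted MNIST figures implies an average inference power of roughly 2.7 μJ × 8,400 classifications/s ≈ 23 mW.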
