{"title":"NEXUS: A 28nm 3.3pJ/SOP 16-Core Spiking Neural Network With a Diamond Topology for Real-Time Data Processing","authors":"Maryam Sadeghi;Yasser Rezaeiyan;Dario Fernandez Khatiboun;Sherif Eissa;Federico Corradi;Charles Augustine;Farshad Moradi","doi":"10.1109/TBCAS.2024.3452635","journal":{"name":"IEEE transactions on biomedical circuits and systems","volume":"19 3","pages":"523-535"},"publicationDate":"2024-08-30","publicationTypes":"Journal Article","platform":"Semanticscholar","ListUrlMain":"https://ieeexplore.ieee.org/document/10661301/"}
NEXUS: A 28nm 3.3pJ/SOP 16-Core Spiking Neural Network With a Diamond Topology for Real-Time Data Processing
The realization of brain-scale spiking neural networks (SNNs) is impeded by power constraints and low integration density. To address these challenges, multi-core SNNs are used to emulate large numbers of neurons with high energy efficiency, with spike packets routed through a network-on-chip (NoC). However, information can be lost in the NoC under high spike traffic, leading to performance degradation. This work presents NEXUS, a 16-core SNN with a diamond-shaped NoC topology fabricated in 28-nm CMOS technology. It integrates 4096 leaky integrate-and-fire (LIF) neurons with 1M 4-bit synaptic weights, occupying an area of 2.16 mm$^2$. The proposed NoC architecture is scalable to any network size and ensures that no data is lost to contending packets, with a maximum routing latency of 5.1 $\mu$s for 16 cores. The proposed congestion management method eliminates the need for FIFOs in routers, resulting in a compact router footprint of 0.001 mm$^2$. The proposed neurosynaptic core increases processing speed by up to 8.5$\times$, depending on input sparsity. The SNN achieves a peak throughput of 4.7 GSOP/s at 0.9 V and a minimum energy per synaptic operation (SOP) of 3.3 pJ at 0.55 V. A 4-layer feed-forward network mapped onto the chip classifies MNIST digits with 92.3% accuracy at 8.4K classifications/s, consuming 2.7 $\mu$J/classification. Additionally, an audio recognition task mapped onto the chip achieves 87.4% accuracy at 215 $\mu$J/classification.
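As a rough sanity check, the headline numbers in the abstract can be combined into a few derived figures. The sketch below is not from the paper. It assumes that "1M" means 2**20 weights, that neurons and weights are split evenly across the 16 cores, and that the MNIST rate and energy per classification were measured at the same operating point. The function name `derived_figures` and all constant names are hypothetical.

```python
# Back-of-the-envelope figures derived from numbers quoted in the abstract.
# Assumptions (not stated in the source): "1M" weights means 2**20, resources
# are divided evenly across cores, and the MNIST throughput and energy figures
# refer to the same operating point.

NUM_CORES = 16
NUM_NEURONS = 4096
NUM_WEIGHTS = 2**20            # assumed reading of "1M"
WEIGHT_BITS = 4

MNIST_RATE_PER_S = 8.4e3       # classifications per second
MNIST_ENERGY_J = 2.7e-6        # joules per classification


def derived_figures():
    """Return per-core resources, weight storage, and implied MNIST power."""
    neurons_per_core = NUM_NEURONS // NUM_CORES
    weights_per_core = NUM_WEIGHTS // NUM_CORES
    weights_per_neuron = weights_per_core // neurons_per_core
    # Total weight storage in KiB: bits -> bytes -> KiB.
    weight_memory_kib = NUM_WEIGHTS * WEIGHT_BITS / 8 / 1024
    # Average power implied by rate x energy per classification, in mW.
    mnist_power_mw = MNIST_RATE_PER_S * MNIST_ENERGY_J * 1e3
    return {
        "neurons_per_core": neurons_per_core,
        "weights_per_core": weights_per_core,
        "weights_per_neuron": weights_per_neuron,
        "weight_memory_kib": weight_memory_kib,
        "mnist_power_mw": mnist_power_mw,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value}")
```

Under these assumptions each core would hold 256 neurons with 256 weights per neuron, the weights would occupy 512 KiB in total, and MNIST classification would draw roughly 22.7 mW on average. The peak throughput (0.9 V) and minimum energy per SOP (0.55 V) are quoted at different supply voltages, so the sketch deliberately does not multiply them together.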