EPOC: A 28-nm 5.3 pJ/SOP Event-driven Parallel Neuromorphic Hardware with Neuromodulation-based Online Learning
Faquan Chen, Qingyang Tian, Lisheng Xie, Yifan Zhou, Ziren Wu, Liangshun Wu, Rendong Ying, Fei Wen, Peilin Liu
IEEE Transactions on Biomedical Circuits and Systems, published 2024-10-02
DOI: 10.1109/TBCAS.2024.3470520 (https://doi.org/10.1109/TBCAS.2024.3470520)
Citations: 0
Abstract
Bio-inspired neuromorphic hardware with learning ability is a promising route to human-like intelligence, particularly in terms of high energy efficiency and strong environmental adaptability. Although many customized prototypes have demonstrated learning ability, learning on neuromorphic hardware still lacks a bio-plausible, unified framework, and the inherent sparsity and parallelism of spike-based computation have not been fully exploited, which fundamentally limits computational efficiency and scale. We therefore develop EPOC, a unified, event-driven, and massively parallel multi-core neuromorphic online learning processor. We present a neuromodulation-based neuromorphic online learning framework that unifies various learning algorithms: through different neuromodulator formulations, EPOC supports high-accuracy local and global supervised Spiking Neural Network (SNN) learning with a low-memory-demand, streaming, single-sample learning strategy. EPOC combines a novel event-driven computation method, which fully exploits spike-based sparsity throughout the forward and backward learning phases, with a parallel multi-channel, multi-core computing architecture, yielding a 9.9× improvement in time efficiency over the baseline architecture. We synthesize EPOC in a 28-nm CMOS process and perform extensive benchmarking. EPOC achieves state-of-the-art learning accuracy of 99.2%, 98.2%, and 94.3% on the MNIST, NMNIST, and DVS-Gesture benchmarks, respectively. With local learning, EPOC achieves a 2.9× improvement in time efficiency over its global-learning counterpart. EPOC operates at a typical clock frequency of 100 MHz, providing a peak throughput of 328 GOPS/51 GSOPS and an energy efficiency of 5.3 pJ/SOP.
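To make the idea of exploiting spike-based sparsity concrete, the following is a minimal Python sketch of a generic event-driven forward pass. It is not EPOC's actual datapath, which the abstract does not describe. The layer size, the 5% spike rate, and the function names `dense_forward` and `event_driven_forward` are assumptions chosen for illustration. The only figures taken from the abstract are 5.3 pJ/SOP, 51 GSOPS, and 100 MHz, and applying the per-SOP energy to a toy layer is purely illustrative.

```python
import numpy as np

# Figures quoted in the abstract.
PJ_PER_SOP = 5.3          # energy per synaptic operation (pJ)
PEAK_SOPS = 51e9          # peak synaptic operations per second
CLOCK_HZ = 100e6          # typical clock frequency


def dense_forward(weights, spikes):
    """Baseline: touch every synapse, whether or not its input spiked."""
    return weights @ spikes, weights.size


def event_driven_forward(weights, spikes):
    """Accumulate weight columns only for inputs that actually spiked."""
    active = np.flatnonzero(spikes)            # indices of spiking inputs
    currents = weights[:, active].sum(axis=1)  # one add per active synapse
    return currents, weights.shape[0] * active.size


# Hypothetical toy layer: 784 inputs, 128 neurons, about 5% of inputs spiking.
rng = np.random.default_rng(0)
weights = rng.normal(size=(128, 784))
spikes = (rng.random(784) < 0.05).astype(weights.dtype)

dense_out, dense_ops = dense_forward(weights, spikes)
event_out, event_ops = event_driven_forward(weights, spikes)

# Both paths compute the same input currents; only the work differs.
assert np.allclose(dense_out, event_out)

# Illustrative only: energy if each event-driven op cost 5.3 pJ.
energy_nj = event_ops * PJ_PER_SOP * 1e-3

# Simple arithmetic on the quoted figures, assuming the peak SOP
# throughput is reached at the 100 MHz clock.
sops_per_cycle = PEAK_SOPS / CLOCK_HZ

print(f"dense ops: {dense_ops}, event-driven ops: {event_ops}")
print(f"illustrative energy: {energy_nj:.1f} nJ")
print(f"implied SOPs per clock cycle at peak: {sops_per_cycle:.0f}")
```

The operation count of the event-driven path scales with the number of spikes rather than the number of synapses, which is the general property that sparsity-aware designs rely on. Under the stated assumption about the clock, the quoted peak throughput corresponds to 510 synaptic operations per cycle, which indicates the degree of parallelism involved.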