A Programmable Event-driven Architecture for Evaluating Spiking Neural Networks

2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED). Pub Date: 2017-07-01. DOI: 10.1109/ISLPED.2017.8009176

Arnab Roy, Swagath Venkataramani, Neel Gala, Sanchari Sen, Kamakoti Veezhinathan, A. Raghunathan


Citations: 16

Abstract

Spiking neural networks (SNNs) represent the third generation of neural networks and are expected to enable new classes of machine learning applications. However, evaluating large-scale SNNs (e.g., at the scale of the visual cortex) on power-constrained systems requires significant improvements in computing efficiency. A unique attribute of SNNs is their event-driven nature: information is encoded as a series of spikes, and work is dynamically generated as spikes propagate through the network. Therefore, parallel implementations of SNNs on multi-cores and GPGPUs are severely limited by communication and synchronization overheads. Recent years have seen great interest in deep learning accelerators for non-spiking neural networks; however, these architectures are not well suited to the dynamic, irregular parallelism in SNNs. Prior efforts on specialized SNN hardware use spatial architectures, wherein each neuron is allocated a dedicated processing element, and large networks are realized by connecting multiple chips into a system. While suitable for large-scale systems, this approach is not a good match for size- or cost-constrained mobile devices. We propose PEASE, a Programmable Event-driven processor Architecture for SNN Evaluation. PEASE comprises Spike Processing Units (SPUs) that are dynamically scheduled to execute computations triggered by a spike. Instructions to the SPUs are dynamically generated by Spike Schedulers (SSs) that use event queues to track unprocessed spikes and identify neurons that need to be evaluated. The memory hierarchy in PEASE is fully software managed, and the processing elements are interconnected using a two-tiered bus-ring topology matching the communication characteristics of SNNs. We propose a method to map any given SNN to PEASE such that the workload is balanced across SPUs and SPU clusters, while pipelining across layers of the network to improve performance. We implemented PEASE at the RTL level and synthesized it to IBM 45 nm technology. Across 6 SNN benchmarks, our 64-SPU configuration of PEASE achieves 7.1× to 17.5× and 2.6× to 5.8× speedups, respectively, over software implementations on an Intel Xeon E5-2680 CPU and an NVIDIA Tesla K40C GPU. The energy reductions over the CPU and GPU are 71× to 179× and 198× to 467×, respectively.
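
The event-driven evaluation described in the abstract can be illustrated in software. The following is a minimal sketch, not the PEASE hardware or the authors' implementation: it assumes a simple integrate-and-fire neuron with a single shared threshold and a reset to zero, and the names `run_event_driven`, `weights`, and `threshold` are hypothetical. It shows only the general idea that an event queue holds unprocessed spikes and that work is generated only for neurons a spike actually reaches.

```python
from collections import deque


def run_event_driven(weights, threshold, input_spikes, max_events=10_000):
    """Evaluate a small spiking network by processing spikes as events.

    weights: dict mapping a presynaptic neuron id to a list of
             (postsynaptic id, synaptic weight) pairs (assumed layout).
    threshold: firing threshold shared by all neurons (an assumption).
    input_spikes: iterable of neuron ids that spike initially.
    max_events: safety cap so recurrent networks cannot loop forever.
    """
    potential = {}                # membrane potentials, created lazily
    fired = []                    # neuron ids in the order they spiked
    queue = deque(input_spikes)   # event queue of unprocessed spikes
    processed = 0

    while queue and processed < max_events:
        src = queue.popleft()
        processed += 1
        fired.append(src)
        # Only the targets of this spike are touched; idle neurons cost nothing.
        for dst, w in weights.get(src, []):
            potential[dst] = potential.get(dst, 0.0) + w
            if potential[dst] >= threshold:
                potential[dst] = 0.0   # reset after firing
                queue.append(dst)      # the new spike becomes a new event
    return fired, potential


if __name__ == "__main__":
    # Two input neurons (0 and 1) jointly drive neuron 2, which drives neuron 3.
    net = {0: [(2, 0.6)], 1: [(2, 0.6)], 2: [(3, 1.0)]}
    order, state = run_event_driven(net, threshold=1.0, input_spikes=[0, 1])
    print(order)
```

In this toy network neuron 2 fires only after both inputs have spiked, so the amount of work depends on spike activity rather than on network size. This data-dependent, irregular workload is the property the abstract identifies as hard to parallelize on multi-cores and GPGPUs.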
