A 1.40mm² 141mW 898GOPS sparse neuromorphic processor in 40nm CMOS

Phil C. Knag, Chester Liu, Zhengya Zhang
Published in: 2016 IEEE Symposium on VLSI Circuits (VLSI-Circuits), pp. 1-2, 2016-06-15
DOI: 10.1109/VLSIC.2016.7573526 (https://doi.org/10.1109/VLSIC.2016.7573526)
Citations: 10

Abstract

Sparsity is a brain-inspired property that enables a significant reduction in workload and power dissipation of deep learning. This work presents a 1.40mm² 40nm CMOS sparse neuromorphic processor that implements a two-layer convolutional restricted Boltzmann machine (CRBM) for inference and a support vector machine (SVM) classifier. The processor incorporates sparse convolvers to realize sparsity-proportional workload reduction. The architecture is parallelized along a non-sparse dimension to minimize stalling. At 0.9V and 240MHz, the processor achieves an effective 898.2GOPS performance, dissipating 140.9mW. Using sparsity, we reduce the workload, datapath power consumption, and area by 3.4×, 3.3×, and 1.74×, respectively. The design uses latch-based memory to reduce area and dynamic clock gating to save power.
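
The idea of sparsity-proportional workload reduction can be illustrated with a small software sketch. This is not the chip's convolver design, which the abstract does not describe in detail; it only assumes that a sparse convolver does work for nonzero input activations and skips the zeros. The function name `scatter_conv2d`, the 16×16 input, the 3×3 kernel, and the roughly 70% sparsity level are all hypothetical choices made for illustration.

```python
import numpy as np

def scatter_conv2d(x, kernel, skip_zeros):
    """Full 2D convolution in scatter form: each input activation adds a
    scaled copy of the kernel to the output. Returns (output, MAC count).
    Illustrative only; not the processor's actual datapath."""
    h, w = x.shape
    kh, kw = kernel.shape
    out = np.zeros((h + kh - 1, w + kw - 1))
    macs = 0
    for i in range(h):
        for j in range(w):
            a = x[i, j]
            if skip_zeros and a == 0:
                continue  # a zero activation contributes nothing, so skip it
            out[i:i + kh, j:j + kw] += a * kernel
            macs += kh * kw
    return out, macs

# Hypothetical input: about 70% of activations are set to zero.
rng = np.random.default_rng(0)
x = rng.random((16, 16))
x[rng.random((16, 16)) < 0.7] = 0.0
kernel = rng.standard_normal((3, 3))

dense_out, dense_macs = scatter_conv2d(x, kernel, skip_zeros=False)
sparse_out, sparse_macs = scatter_conv2d(x, kernel, skip_zeros=True)

# Same result, but the sparse pass does work only for nonzero activations.
assert np.allclose(dense_out, sparse_out)
reduction = dense_macs / sparse_macs
print(f"dense MACs: {dense_macs}, sparse MACs: {sparse_macs}, "
      f"reduction: {reduction:.2f}x")
```

In this sketch the multiply-accumulate count scales with the number of nonzero activations, so the workload reduction equals the inverse of the input density. The 3.4× figure reported in the abstract is a measured property of the authors' design and workload, and this toy example does not attempt to reproduce it.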