3D-aCortex: An Ultra-Compact Energy-Efficient Neurocomputing Platform Based on Commercial 3D-NAND Flash Memories

Mohammad Bavandpour, Shubham Sahay, M. Mahmoodi, D. Strukov
{"title":"3D-aCortex: An Ultra-Compact Energy-Efficient Neurocomputing Platform Based on Commercial 3D-NAND Flash Memories","authors":"Mohammad Bavandpour, Shubham Sahay, M. Mahmoodi, D. Strukov","doi":"10.1088/2634-4386/AC0775","DOIUrl":null,"url":null,"abstract":"The first contribution of this paper is the development of extremely dense, energy-efficient mixed-signal vector-by-matrix-multiplication (VMM) circuits based on the existing 3D-NAND flash memory blocks, without any need for their modification. Such compatibility is achieved using time-domain-encoded VMM design. Our detailed simulations have shown that, for example, the 5-bit VMM of 200-element vectors, using the commercially available 64-layer gate-all-around macaroni-type 3D-NAND memory blocks designed in the 55-nm technology node, may provide an unprecedented area efficiency of 0.14 um2/byte and energy efficiency of ~10 fJ/Op, including the input/output and other peripheral circuitry overheads. Our second major contribution is the development of 3D-aCortex, a multi-purpose neuromorphic inference processor that utilizes the proposed 3D-VMM blocks as its core processing units. We have performed rigorous performance simulations of such a processor on both circuit and system levels, taking into account non-idealities such as drain-induced barrier lowering, capacitive coupling, charge injection, parasitics, process variations, and noise. Our modeling of the 3D-aCortex performing several state-of-the-art neuromorphic-network benchmarks has shown that it may provide the record-breaking storage efficiency of 4.34 MB/mm2, the peak energy efficiency of 70.43 TOps/J, and the computational throughput up to 10.66 TOps/s. The storage efficiency can be further improved seven-fold by aggressively sharing VMM peripheral circuits at the cost of slight decrease in energy efficiency and throughput.","PeriodicalId":114772,"journal":{"name":"Neuromorph. Comput. Eng.","volume":"137 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"20","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neuromorph. Comput. Eng.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1088/2634-4386/AC0775","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 20

Abstract

The first contribution of this paper is the development of extremely dense, energy-efficient mixed-signal vector-by-matrix-multiplication (VMM) circuits based on the existing 3D-NAND flash memory blocks, without any need for their modification. Such compatibility is achieved using time-domain-encoded VMM design. Our detailed simulations have shown that, for example, the 5-bit VMM of 200-element vectors, using the commercially available 64-layer gate-all-around macaroni-type 3D-NAND memory blocks designed in the 55-nm technology node, may provide an unprecedented area efficiency of 0.14 um2/byte and energy efficiency of ~10 fJ/Op, including the input/output and other peripheral circuitry overheads. Our second major contribution is the development of 3D-aCortex, a multi-purpose neuromorphic inference processor that utilizes the proposed 3D-VMM blocks as its core processing units. We have performed rigorous performance simulations of such a processor on both circuit and system levels, taking into account non-idealities such as drain-induced barrier lowering, capacitive coupling, charge injection, parasitics, process variations, and noise. Our modeling of the 3D-aCortex performing several state-of-the-art neuromorphic-network benchmarks has shown that it may provide the record-breaking storage efficiency of 4.34 MB/mm2, the peak energy efficiency of 70.43 TOps/J, and the computational throughput up to 10.66 TOps/s. The storage efficiency can be further improved seven-fold by aggressively sharing VMM peripheral circuits at the cost of slight decrease in energy efficiency and throughput.
3D-aCortex:一个基于商用3D-NAND闪存的超紧凑节能神经计算平台
本文的第一个贡献是基于现有的3D-NAND闪存块开发了极其密集、节能的混合信号矢量矩阵乘法(VMM)电路,而无需对其进行任何修改。这种兼容性是使用时域编码VMM设计实现的。我们的详细模拟表明,例如,采用55纳米技术节点设计的64层栅极全通面型3D-NAND存储块,200元矢量的5位VMM可以提供前所未有的0.14 um2/byte的面积效率和~10 fJ/Op的能量效率,包括输入/输出和其他外围电路开销。我们的第二个主要贡献是3D-aCortex的开发,这是一种多用途的神经形态推理处理器,利用提出的3D-VMM模块作为其核心处理单元。我们在电路和系统层面对这种处理器进行了严格的性能模拟,考虑到非理想情况,如漏极诱导的阻挡降低、电容耦合、电荷注入、寄生、工艺变化和噪声。我们对3D-aCortex进行了几次最先进的神经形态网络基准测试,结果表明,它可以提供创纪录的4.34 MB/mm2的存储效率,70.43 TOps/J的峰值能量效率,以及高达10.66 TOps/s的计算吞吐量。通过积极共享VMM外围电路,存储效率可以进一步提高7倍,但代价是能量效率和吞吐量略有下降。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信