MOCCA: A Process Variation Tolerant Systolic DNN Accelerator using CNFETs in Monolithic 3D

Samuel J. Engers, Cheng Chu, Dawen Xu, Ying Wang, Fan Chen
Proceedings of the Great Lakes Symposium on VLSI 2022. Published 2022-06-06. DOI: 10.1145/3526241.3530380 (https://doi.org/10.1145/3526241.3530380)
Citations: 1

Abstract

Hardware accelerators based on systolic arrays have become the dominant approach to efficient processing of deep neural networks (DNNs). Although such designs provide significant performance improvements over contemporary CPUs and GPUs, their power efficiency and area efficiency are greatly limited by the large computing array and on-chip memory. In this work, we demonstrate that the efficiency of systolic accelerators can be further improved using emerging carbon nanotube field-effect transistors (CNFETs), by stacking the computing logic and on-chip memory on multiple layers and using monolithic 3D (M3D) vias for low-latency communication. We comprehensively explore the design space and present MOCCA, the first process-variation-tolerant CNFET-based systolic DNN accelerator. We validate MOCCA against previous 2D accelerators on state-of-the-art DNN models. On average, MOCCA achieves the same throughput with 6.12× and 2.12× improvements in performance and power efficiency, respectively, in a 2× smaller chip footprint.