A-DSCNN: Depthwise Separable Convolutional Neural Network Inference Chip Design Using an Approximate Multiplier

Jin-Jia Shang, Nicholas Phipps, I-Chyn Wey, T. Teo
{"title":"A-DSCNN: Depthwise Separable Convolutional Neural Network Inference Chip Design Using an Approximate Multiplier","authors":"Jin-Jia Shang, Nicholas Phipps, I-Chyn Wey, T. Teo","doi":"10.3390/chips2030010","DOIUrl":null,"url":null,"abstract":"For Convolutional Neural Networks (CNNs), Depthwise Separable CNN (DSCNN) is the preferred architecture for Application Specific Integrated Circuit (ASIC) implementation on edge devices. It benefits from a multi-mode approximate multiplier proposed in this work. The proposed approximate multiplier uses two 4-bit multiplication operations to implement a 12-bit multiplication operation by reusing the same multiplier array. With this approximate multiplier, sequential multiplication operations are pipelined in a modified DSCNN to fully utilize the Processing Element (PE) array in the convolutional layer. Two versions of Approximate-DSCNN (A-DSCNN) accelerators were implemented on TSMC 40 nm CMOS process with a supply voltage of 0.9 V. At a clock frequency of 200 MHz, the designs achieve 4.78 GOPs/mW and 4.89 GOP/mW power efficiency while occupying 1.16 mm2 and 0.398 mm2 area, respectively.","PeriodicalId":6666,"journal":{"name":"2015 IEEE Hot Chips 27 Symposium (HCS)","volume":"16 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 IEEE Hot Chips 27 Symposium (HCS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/chips2030010","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

For Convolutional Neural Networks (CNNs), the Depthwise Separable CNN (DSCNN) is the preferred architecture for Application-Specific Integrated Circuit (ASIC) implementation on edge devices, and it benefits from the multi-mode approximate multiplier proposed in this work. The proposed approximate multiplier implements a 12-bit multiplication with two 4-bit multiplication operations by reusing the same multiplier array. With this approximate multiplier, sequential multiplication operations are pipelined in a modified DSCNN to fully utilize the Processing Element (PE) array in the convolutional layer. Two versions of the Approximate-DSCNN (A-DSCNN) accelerator were implemented in a TSMC 40 nm CMOS process with a 0.9 V supply voltage. At a clock frequency of 200 MHz, the designs achieve power efficiencies of 4.78 GOPS/mW and 4.89 GOPS/mW while occupying 1.16 mm² and 0.398 mm² of area, respectively.
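The abstract does not spell out how the 12-bit product is partitioned across the two 4-bit passes, so the sketch below is only one plausible truncation-based interpretation, not the paper's actual circuit: operand `a` is reduced to the 4 bits below its leading one, operand `b` to its leading 8 bits, and the two 4x4 partial products are accumulated with the appropriate shifts. The function name `approx_mult_12b` and the 4-/8-bit split are assumptions introduced for illustration.

```python
def approx_mult_12b(a: int, b: int) -> int:
    """Hypothetical two-pass approximate 12-bit multiply (not the paper's exact scheme).

    Operand `a` is dynamically truncated to its leading 4-bit segment and
    operand `b` to its leading 8-bit segment; the two 4x4 partial products
    are then accumulated with the appropriate shifts.
    """
    assert 0 <= a < (1 << 12) and 0 <= b < (1 << 12)
    if a == 0 or b == 0:
        return 0

    def truncate(x, keep):
        # Keep `keep` bits starting at the leading one; return (segment, weight).
        shift = max(x.bit_length() - keep, 0)
        return x >> shift, shift

    a_seg, sa = truncate(a, 4)           # 4-bit leading segment of a
    b_seg, sb = truncate(b, 8)           # 8-bit leading segment of b
    b_hi, b_lo = b_seg >> 4, b_seg & 0xF

    # Two sequential passes through the same 4x4 multiplier array.
    pp_hi = a_seg * b_hi                 # pass 1: a_seg x upper nibble of b
    pp_lo = a_seg * b_lo                 # pass 2: a_seg x lower nibble of b
    return ((pp_hi << 4) + pp_lo) << (sa + sb)


if __name__ == "__main__":
    a, b = 3000, 2500
    exact, approx = a * b, approx_mult_12b(a, b)
    print(f"exact={exact} approx={approx} rel_err={abs(exact - approx) / exact:.2%}")
```

In hardware, the two passes would time-share a single small multiplier array, which is consistent with the abstract's point that sequential multiplications are pipelined so the PE array in the convolutional layer stays fully utilized.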