An Optimized Architecture For Decomposed Convolutional Neural Networks

Fangxuan Sun, Jun Lin, Zhongfeng Wang
2018 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), published 2018-07-01. DOI: 10.1109/ISVLSI.2018.00100 (https://doi.org/10.1109/ISVLSI.2018.00100)

Abstract

Convolutional neural networks (CNNs) have found extensive applications in various tasks. However, state-of-the-art CNNs are both computation-intensive and memory-intensive, which poses tremendous challenges for hardware implementation. Various methods have been proposed to reduce the model size and computational complexity of a CNN. Among them, the canonical polyadic decomposition (CPD) method is better suited to hardware implementation because of the regularity of the decomposed filters. Moreover, the CPD method can be combined with widely used pruning methods to compress the model further. To the best of our knowledge, this paper proposes the first efficient hardware architecture for CPD-CNNs, based on a carefully designed data flow. Specifically, a reconfigurable fast convolution unit is introduced to reduce the number of multiplications while handling some commonly used convolution core operations. The proposed architecture is coded in RTL and synthesized in TSMC 90 nm CMOS technology. Our design achieves an equivalent throughput of more than 3 TOP/s at a 650 MHz clock frequency.
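The abstract does not spell out the decomposition layout or the fast convolution algorithm used, so the following Python sketch is only an illustration under stated assumptions. It assumes the common four-factor CP layout for a k x k convolution (a 1x1 layer, a k x 1 and a 1 x k per-channel filter, and another 1x1 layer) to show how the weight count shrinks, and it uses the well-known Winograd F(2,3) algorithm as one example of how a fast convolution can trade multiplications for additions. All function names and the example sizes are hypothetical and are not taken from the paper.

```python
def dense_conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution layer."""
    return c_in * c_out * k * k


def cpd_conv_params(c_in, c_out, k, rank):
    """Weight count after an assumed rank-`rank` CP decomposition into
    four factors: 1x1 (c_in -> rank), k x 1 and 1 x k per-channel
    filters, and 1x1 (rank -> c_out)."""
    return rank * (c_in + k + k + c_out)


def direct_f23(d, g):
    """Two outputs of a 3-tap filter computed directly (6 multiplications)."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """The same two outputs using 4 data multiplications (Winograd F(2,3)).

    This is a generic fast convolution example, not necessarily the
    algorithm implemented by the paper's reconfigurable unit."""
    # Filter-side terms depend only on g and can be precomputed once.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]
    # Four multiplications between transformed data and transformed filter.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3
    return [m1 + m2 + m3, m2 - m3 - m4]


if __name__ == "__main__":
    # Hypothetical layer sizes chosen only for illustration.
    print(dense_conv_params(64, 64, 3), cpd_conv_params(64, 64, 3, rank=16))
    print(direct_f23([1, 2, 3, 4], [1, 0, -1]))
    print(winograd_f23([1, 2, 3, 4], [1, 0, -1]))
```

In this sketch the per-output multiplication count of the 3-tap filter drops from 3 to 2, at the cost of extra additions and a filter transform that can be done offline. The rank is a free parameter of the decomposition, and the accuracy impact of any particular rank is not something this example can show.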