A Tile-based Fused-layer CNN Accelerator for FPGAs

Fabrizio Indirli, Ahmet Erdem, C. Silvano
{"title":"A Tile-based Fused-layer CNN Accelerator for FPGAs","authors":"Fabrizio Indirli, Ahmet Erdem, C. Silvano","doi":"10.1109/ICECS49266.2020.9294981","DOIUrl":null,"url":null,"abstract":"The acceleration of Convolutional Neural Networks (CNNs) on FPGAs is becoming increasingly popular for computer vision tasks. However, the limited memory and bandwidth of these devices pose some challenges to the design of conventional CNN accelerators, which use external DRAM to store the intermediate results of each layer. To mitigate these criticalities, researchers have proposed the fused-layer methodology, which diminishes the accesses to the external DRAM by accelerating simultaneously multiple subsequent layers on the same chip. In this work, we propose a configurable fused-layer accelerator that exploits output tiling and the half-precision float datatype to reduce resource utilization. We assessed its effectiveness with experiments on VGG-16 and Yolo-Lite CNNs, targeting a Xilinx Zynq ZU6EG FPGA. Our design achieved up to 42% speedup and up to 95% fewer transfers from external memory compared to a single-layer baseline solution. Moreover, to ease and quicken the design space exploration, we developed a Machine Learning model that predicts the performance and the resource utilization of our accelerator with an accuracy > 90% on the reported dataset.","PeriodicalId":404022,"journal":{"name":"2020 27th IEEE International Conference on Electronics, Circuits and Systems (ICECS)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 27th IEEE International Conference on Electronics, Circuits and Systems (ICECS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICECS49266.2020.9294981","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

The acceleration of Convolutional Neural Networks (CNNs) on FPGAs is becoming increasingly popular for computer vision tasks. However, the limited memory and bandwidth of these devices pose challenges to the design of conventional CNN accelerators, which use external DRAM to store the intermediate results of each layer. To mitigate these issues, researchers have proposed the fused-layer methodology, which reduces accesses to external DRAM by accelerating multiple consecutive layers simultaneously on the same chip. In this work, we propose a configurable fused-layer accelerator that exploits output tiling and the half-precision float datatype to reduce resource utilization. We assessed its effectiveness with experiments on the VGG-16 and Yolo-Lite CNNs, targeting a Xilinx Zynq ZU6EG FPGA. Our design achieved up to a 42% speedup and up to 95% fewer transfers from external memory compared to a single-layer baseline solution. Moreover, to ease and speed up design space exploration, we developed a Machine Learning model that predicts the performance and resource utilization of our accelerator with >90% accuracy on the reported dataset.
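The core of the fused-layer idea described above is that, for each tile of the final output, only the input window it depends on is fetched from DRAM, while the intermediate feature maps between the fused layers stay in on-chip buffers. The sketch below illustrates this for two fused 3x3 convolution layers with half-precision data; it is a minimal NumPy model for intuition only, and the tile size, layer shapes, and the conv2d/fused_two_layers_tiled helpers are assumptions for illustration, not the paper's actual HLS design.

```python
# Minimal sketch of fused-layer execution with output tiling (illustrative only).
import numpy as np

def conv2d(x, w):
    """Valid 2-D convolution (cross-correlation) of a single-channel map x with kernel w."""
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow), dtype=x.dtype)
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i+kh, j:j+kw] * w)
    return out

def fused_two_layers_tiled(inp, w1, w2, tile=8):
    """Compute conv(conv(inp, w1), w2) one output tile at a time.
    The intermediate feature map never goes to external memory: for each
    tile of the final output, only the input window it needs is loaded."""
    k1, k2 = w1.shape[0], w2.shape[0]
    out_h = inp.shape[0] - (k1 - 1) - (k2 - 1)
    out_w = inp.shape[1] - (k1 - 1) - (k2 - 1)
    out = np.zeros((out_h, out_w), dtype=np.float16)  # half-precision output
    for ti in range(0, out_h, tile):
        for tj in range(0, out_w, tile):
            th = min(tile, out_h - ti)
            tw = min(tile, out_w - tj)
            # Input window for this tile, including the halo of both kernels
            win = inp[ti: ti + th + k1 + k2 - 2,
                      tj: tj + tw + k1 + k2 - 2]
            mid = conv2d(win, w1)                     # stays "on chip"
            out[ti:ti+th, tj:tj+tw] = conv2d(mid, w2)[:th, :tw]
    return out

# The fused, tiled result matches the plain layer-by-layer reference.
rng = np.random.default_rng(0)
x  = rng.standard_normal((32, 32)).astype(np.float16)
w1 = rng.standard_normal((3, 3)).astype(np.float16)
w2 = rng.standard_normal((3, 3)).astype(np.float16)
ref = conv2d(conv2d(x, w1), w2)
assert np.allclose(fused_two_layers_tiled(x, w1, w2), ref, atol=1e-3)
```

Note how the input window is slightly larger than the output tile (the "halo" added by each kernel): this redundant re-reading of input pixels is the price paid for keeping the intermediate results on chip, which is why tile size selection is a key knob in the design space exploration.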