AdaFlow: A Framework for Adaptive Dataflow CNN Acceleration on FPGAs

Guilherme Korol, M. Jordan, M. B. Rutzig, A. C. S. Beck
DOI: 10.23919/DATE54114.2022.9774727
Published in: 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE)
Publication date: 2022-03-14
Citations: 4

Abstract

To meet latency and privacy requirements, resource-hungry deep learning applications have been migrating to the Edge, where IoT devices can offload inference processing to local Edge servers. Since FPGAs have successfully accelerated an increasing number of deep learning applications (especially CNN-based ones), they emerge as an effective alternative for Edge platforms. However, Edge applications may present highly unpredictable workloads, requiring runtime adaptability in the inference processing. Although some works apply model switching on CPU and GPU platforms by exploiting different pruning rates at runtime, so that inference can adapt according to a quality-performance trade-off, FPGA-based accelerators refrain from this approach since they are synthesized for specific CNN models. In this context, this work enables model switching on FPGAs by adding to the well-known FINN accelerator an extra level of adaptability (i.e., flexibility) and support for the dynamic use of pruning, either via fast model switches on flexible accelerators, at the cost of some extra logic, or via FPGA reconfigurations of fixed accelerators. Building on that, we developed AdaFlow: a framework that automatically builds, at design time, a library of these newly available versions (flexible or fixed, pruned or not) that is used, at runtime, to dynamically select a version according to a user-configurable accuracy threshold and current workload conditions. We have evaluated AdaFlow under a smart Edge surveillance application with two CNN models and two datasets, showing that AdaFlow processes, on average, 1.3× more inferences and achieves, on average, 1.4× the power efficiency of state-of-the-art statically deployed dataflow accelerators.
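The abstract does not spell out the runtime selection algorithm, but the described policy, pick an accelerator version from a design-time library subject to a user-configurable accuracy threshold and the current workload, can be sketched as follows. All names (`Version`, `select_version`) and the specific load heuristic are hypothetical illustrations, not AdaFlow's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class Version:
    name: str
    accuracy: float      # validation accuracy of this (possibly pruned) model
    throughput: float    # inferences per second on the FPGA
    flexible: bool       # True: fast model switch; False: needs FPGA reconfiguration

def select_version(library, accuracy_threshold, pending_requests, deadline_s):
    """Hypothetical policy: among versions meeting the accuracy threshold,
    keep the most accurate one while the workload is manageable; under
    overload, fall back to the highest-throughput eligible version."""
    eligible = [v for v in library if v.accuracy >= accuracy_threshold]
    if not eligible:
        raise ValueError("no version satisfies the accuracy threshold")
    required_throughput = pending_requests / deadline_s
    fast_enough = [v for v in eligible if v.throughput >= required_throughput]
    if fast_enough:
        # Workload is manageable: maximize accuracy.
        return max(fast_enough, key=lambda v: v.accuracy)
    # Overloaded: trade accuracy for throughput within the threshold.
    return max(eligible, key=lambda v: v.throughput)

# Example library: an unpruned version plus two pruned variants
# (accuracy/throughput numbers are made up for illustration).
library = [
    Version("dense",      accuracy=0.92, throughput=100.0, flexible=False),
    Version("pruned-50%", accuracy=0.88, throughput=160.0, flexible=True),
    Version("pruned-75%", accuracy=0.81, throughput=220.0, flexible=True),
]

print(select_version(library, 0.85, pending_requests=120, deadline_s=1.0).name)
```

Under light load the policy sticks with the dense model; as pending requests grow past what the current version can sustain, it switches to a more aggressively pruned version, mirroring the quality-performance trade-off the paper exploits.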