Toward Fast Platform-Aware Neural Architecture Search for FPGA-Accelerated Edge AI Applications

Yi-Chuan Liang, Ying-Chiao Liao, Chen-Ching Lin, Shih-Hao Hung
Proceedings of the International Conference on Research in Adaptive and Convergent Systems, published 2020-10-13
DOI: 10.1145/3400286.3418240 (https://doi.org/10.1145/3400286.3418240)
Citations: 1

Abstract

Neural Architecture Search (NAS) is a technique for finding suitable neural network architectures for a given application. Such search methods have usually been based on reinforcement learning, with a recurrent neural network generating candidate network models. However, most NAS methods aim to find a set of candidates with the best cost-performance ratios, e.g., high accuracy and low computing time, based on rough estimates derived generically from the workload. Because today's deep learning chips accelerate neural network operations with a variety of hardware techniques, such as vector operations and low-precision data formats, metrics estimated from generic measures such as the number of floating-point operations (FLOPs) can differ greatly from the actual latency, throughput, and power consumption, which are highly sensitive to the hardware design and even to software optimizations in edge AI applications. Thus, instead of spending a long time repeatedly picking and training so-called good candidates based on unreliable estimates, we propose a NAS framework that accelerates the search by incorporating actual performance measurements into the search process. Including actual measurements lets the framework select candidates based on accurate information, reducing the chance of selecting unsuitable candidates and wasting search time on them. To illustrate the effectiveness of our framework, we prototyped it to work with Intel OpenVINO and Field Programmable Gate Arrays (FPGAs) to meet the accuracy and latency required by the user. The framework takes the dataset and the accuracy and latency requirements from the user and automatically searches for candidates that meet those requirements. Case studies and experimental results are presented in this paper to evaluate the effectiveness of our framework for edge AI applications in real-time image classification.
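The idea of placing real measurements inside the search loop can be illustrated with a minimal sketch. This is not the authors' implementation: random sampling stands in for the reinforcement-learning controller, the `Candidate` fields are hypothetical search dimensions, and `fake_measure_latency_ms` and `fake_train_and_evaluate` are placeholder stubs for on-device measurement (e.g., on an FPGA through OpenVINO) and for training on the user's dataset. Measuring latency before training is also an assumption made here to show how unsuitable candidates might be discarded early.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Candidate:
    depth: int  # number of layers (hypothetical search dimension)
    width: int  # channels per layer (hypothetical search dimension)


def sample_candidate(rng: random.Random) -> Candidate:
    # Random sampling stands in for the RL controller described in the abstract.
    return Candidate(depth=rng.randint(2, 12), width=rng.choice([16, 32, 64, 128]))


def fake_measure_latency_ms(c: Candidate) -> float:
    # Placeholder: a real setup would deploy the model to the target device
    # and time inference there instead of using a formula.
    return 0.02 * c.depth * c.width


def fake_train_and_evaluate(c: Candidate) -> float:
    # Placeholder: a real setup would train on the user's dataset.
    # Returns a made-up accuracy in [0, 1].
    return min(0.99, 0.5 + 0.03 * c.depth + 0.001 * c.width)


def search(
    measure: Callable[[Candidate], float],
    evaluate: Callable[[Candidate], float],
    min_accuracy: float,
    max_latency_ms: float,
    trials: int = 200,
    seed: int = 0,
) -> Tuple[List[Tuple[Candidate, float, float]], int]:
    """Return (accepted candidates, number of candidates that were trained)."""
    rng = random.Random(seed)
    accepted: List[Tuple[Candidate, float, float]] = []
    trained = 0
    for _ in range(trials):
        cand = sample_candidate(rng)
        latency = measure(cand)
        if latency > max_latency_ms:
            continue  # too slow on the target: skip the costly training step
        trained += 1
        accuracy = evaluate(cand)
        if accuracy >= min_accuracy:
            accepted.append((cand, accuracy, latency))
    return accepted, trained


if __name__ == "__main__":
    found, trained = search(
        fake_measure_latency_ms, fake_train_and_evaluate,
        min_accuracy=0.8, max_latency_ms=10.0,
    )
    print(f"trained {trained} of 200 sampled candidates, accepted {len(found)}")
```

In this sketch, the user's accuracy and latency requirements act as hard filters, and the count of trained candidates gives a rough sense of how much training effort the latency check avoids.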