Dynamic Sharing in Multi-accelerators of Neural Networks on an FPGA Edge Device
Hsin-Yu Ting, Tootiya Giyahchi, A. A. Sani, E. Bozorgzadeh
2020 IEEE 31st International Conference on Application-specific Systems, Architectures and Processors (ASAP)
DOI: 10.1109/ASAP49362.2020.00040 | Published: 2020-07-01
Citations: 13
Abstract
Edge computing can potentially provide abundant processing resources for compute-intensive applications while bringing services close to end devices. With increasing demand for computing acceleration at the edge, FPGAs have been deployed to provide custom deep neural network (DNN) accelerators. This paper explores a DNN accelerator sharing system on an edge FPGA device that serves various DNN applications from multiple end devices simultaneously. The proposed SharedDNN/PlanAhead policy exploits the regularity among requests for various DNN accelerators, determining which accelerator to allocate to each request and in what order to respond to requests so as to maximize responsiveness for a queue of acceleration requests. Our results show an overall performance gain of up to 2.20x and improved utilization, reducing DNN library usage by up to 27% while staying within the requests' requirements and resource constraints.
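The abstract does not detail the PlanAhead algorithm itself, but the core decision it describes (which accelerator to allocate to each queued request, and in what order to serve requests) can be illustrated with a minimal sketch. Everything below is a hypothetical assumption for illustration, not the paper's implementation: the Request/Accelerator types, the per-variant runtime estimates, and the earliest-deadline-first ordering are all stand-ins.

```python
"""Illustrative sketch of a shared-accelerator scheduling decision.

NOT the paper's SharedDNN/PlanAhead implementation; all names, the cost
model, and the earliest-deadline-first heuristic are assumptions.
"""
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(order=True)
class Request:
    deadline: float                          # latency bound from the end device
    dnn: str = field(compare=False)          # which DNN this request runs
    # Estimated runtime of this DNN on each accelerator variant,
    # e.g. {"small": 40.0, "large": 12.0} (hypothetical variant names).
    est_runtime: Dict[str, float] = field(compare=False)


@dataclass
class Accelerator:
    variant: str               # accelerator configuration (e.g. tile size)
    busy_until: float = 0.0    # time at which its current job finishes


def plan_ahead(queue: List[Request],
               accels: List[Accelerator],
               now: float = 0.0) -> List[Tuple[Request, Accelerator, float]]:
    """Serve requests in earliest-deadline-first order, greedily assigning
    each one to the accelerator that lets it finish soonest. Requests whose
    deadline cannot be met are skipped (a real policy might instead reject
    or renegotiate them)."""
    schedule = []
    for req in sorted(queue):                      # EDF order (assumption)
        best: Optional[Accelerator] = None
        best_finish = float("inf")
        for acc in accels:
            if acc.variant not in req.est_runtime:
                continue                           # DNN not built for this variant
            start = max(now, acc.busy_until)
            finish = start + req.est_runtime[acc.variant]
            if finish < best_finish:
                best, best_finish = acc, finish
        if best is not None and best_finish <= req.deadline:
            best.busy_until = best_finish          # commit the assignment
            schedule.append((req, best, best_finish))
    return schedule


if __name__ == "__main__":
    accels = [Accelerator("small"), Accelerator("large")]
    queue = [
        Request(deadline=50.0, dnn="resnet",
                est_runtime={"small": 40.0, "large": 12.0}),
        Request(deadline=30.0, dnn="mobilenet",
                est_runtime={"small": 15.0, "large": 6.0}),
    ]
    for req, acc, t in plan_ahead(queue, accels):
        print(f"{req.dnn} -> {acc.variant}, finishes at t={t}")
```

A real policy would also have to decide which accelerator bitstreams to keep loaded (the "DNN library" the abstract refers to) under FPGA resource constraints; this sketch shows only the allocate-and-order decision in its simplest greedy form.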