Hsin-Yu Ting, Tootiya Giyahchi, A. A. Sani, E. Bozorgzadeh
2020 IEEE 31st International Conference on Application-specific Systems, Architectures and Processors (ASAP), July 2020. DOI: 10.1109/ASAP49362.2020.00040 (https://doi.org/10.1109/ASAP49362.2020.00040)
Dynamic Sharing in Multi-accelerators of Neural Networks on an FPGA Edge Device
Edge computing can potentially provide abundant processing resources for compute-intensive applications while bringing services close to end devices. With the increasing demand for computing acceleration at the edge, FPGAs have been deployed to provide custom deep neural network (DNN) accelerators. This paper explores a DNN accelerator sharing system on an edge FPGA device that serves various DNN applications from multiple end devices simultaneously. The proposed SharedDNN/PlanAhead policy exploits the regularity among requests for various DNN accelerators and determines which accelerator to allocate to each request, and in what order to respond to the requests, so as to maximize responsiveness for a queue of acceleration requests. Our results show an overall performance gain of up to 2.20x and improved utilization, reducing DNN library usage by up to 27%, while staying within the requests’ requirements and resource constraints.
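The abstract does not describe how the SharedDNN/PlanAhead policy works internally, so the following is only a toy sketch of the general problem it names: choosing an accelerator for each request and an order in which to serve the queue. Everything in it is an assumption made for illustration, not the authors' method: the names `Variant`, `Request`, and `plan`, the abstract `area` units, the latency figures, and the greedy earliest-deadline-first rule with one request served at a time.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Variant:
    """One hypothetical accelerator implementation in the DNN library."""
    name: str
    latency_ms: float  # assumed time to serve one request
    area: int          # assumed FPGA resource cost, in abstract units


@dataclass(frozen=True)
class Request:
    """One hypothetical acceleration request from an end device."""
    rid: str
    model: str          # which DNN the request needs
    deadline_ms: float  # latest acceptable finish time


def plan(requests: List[Request],
         library: Dict[str, List[Variant]],
         area_budget: int) -> List[Tuple[str, str, float]]:
    """Greedy plan: serve requests in earliest-deadline-first order.

    For each request, pick the smallest variant that fits the area budget
    and still meets the deadline; if none meets it, fall back to the
    fastest variant that fits. Returns (request id, variant, finish time).
    """
    schedule = []
    now = 0.0
    for req in sorted(requests, key=lambda r: r.deadline_ms):
        fits = [v for v in library[req.model] if v.area <= area_budget]
        if not fits:
            raise ValueError(f"no variant of {req.model} fits the budget")
        on_time = [v for v in fits if now + v.latency_ms <= req.deadline_ms]
        if on_time:
            choice = min(on_time, key=lambda v: v.area)
        else:
            choice = min(fits, key=lambda v: v.latency_ms)
        now += choice.latency_ms
        schedule.append((req.rid, choice.name, now))
    return schedule


# Made-up library and request queue, purely for demonstration.
library = {
    "cnn": [Variant("cnn_small", 8.0, 30), Variant("cnn_fast", 3.0, 70)],
    "mlp": [Variant("mlp_small", 4.0, 20), Variant("mlp_fast", 1.0, 90)],
}
requests = [
    Request("r1", "cnn", 10.0),
    Request("r2", "mlp", 5.0),
    Request("r3", "cnn", 9.0),
]
schedule = plan(requests, library, area_budget=80)
for rid, variant, finish in schedule:
    print(rid, variant, finish)
```

In this toy setting the smaller `mlp_small` variant is enough for the first request, while both CNN requests need the larger `cnn_fast` variant to finish on time. A policy that plans ahead over the whole queue, as the paper's name suggests, could in principle make different choices than this request-by-request greedy rule.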