基于fpga的嵌入式分布式系统加速DCNN实例研究

2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) Pub Date : 2019-05-20 DOI:10.1109/IPDPSW.2019.00025

Anna Maria Nestorov, Alberto Scolari, Enrico Reggiani, Luca Stornaiuolo, M. Santambrogio

{"title":"基于fpga的嵌入式分布式系统加速DCNN实例研究","authors":"Anna Maria Nestorov, Alberto Scolari, Enrico Reggiani, Luca Stornaiuolo, M. Santambrogio","doi":"10.1109/IPDPSW.2019.00025","DOIUrl":null,"url":null,"abstract":"Face Detection (FD) recently became the base of multiple applications requiring low latency but also with limited resources and energy budgets. Deep Convolutional Neural Networks (DCNNs) are especially accurate in FD, but latency requirements and energy budgets call for Field Programmable Gate Arrays (FPGAs)-based solutions, trading flexibility and efficiency. Nonetheless, the offer of FPGAs solutions is limited and different chips often require expensive re-design phases, while developers desire solutions whose resources can scale proportionally to the demands. Therefore, this work presents an FD solution based on a DCNN on a distributed, embedded system with FPGAs, proposing a general approach to reduce the DCNN size and to design its FPGA cores and investigating its accuracy, performance, and energy efficiency.","PeriodicalId":292054,"journal":{"name":"2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","volume":"81 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"A Case Study for an Accelerated DCNN on FPGA-Based Embedded Distributed System\",\"authors\":\"Anna Maria Nestorov, Alberto Scolari, Enrico Reggiani, Luca Stornaiuolo, M. Santambrogio\",\"doi\":\"10.1109/IPDPSW.2019.00025\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Face Detection (FD) recently became the base of multiple applications requiring low latency but also with limited resources and energy budgets. Deep Convolutional Neural Networks (DCNNs) are especially accurate in FD, but latency requirements and energy budgets call for Field Programmable Gate Arrays (FPGAs)-based solutions, trading flexibility and efficiency. Nonetheless, the offer of FPGAs solutions is limited and different chips often require expensive re-design phases, while developers desire solutions whose resources can scale proportionally to the demands. Therefore, this work presents an FD solution based on a DCNN on a distributed, embedded system with FPGAs, proposing a general approach to reduce the DCNN size and to design its FPGA cores and investigating its accuracy, performance, and energy efficiency.\",\"PeriodicalId\":292054,\"journal\":{\"name\":\"2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)\",\"volume\":\"81 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-05-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IPDPSW.2019.00025\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPDPSW.2019.00025","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 3

摘要

人脸检测(FD)最近成为许多需要低延迟但资源和能源预算有限的应用的基础。深度卷积神经网络(DCNNs)在FD中尤其准确，但延迟要求和能量预算要求基于现场可编程门阵列(fpga)的解决方案，交易灵活性和效率。尽管如此，fpga解决方案的提供是有限的，不同的芯片通常需要昂贵的重新设计阶段，而开发人员希望解决方案的资源可以按比例扩展需求。因此，本研究提出了一种基于DCNN的FPGA分布式嵌入式系统的FD解决方案，提出了一种减小DCNN尺寸和设计其FPGA内核的通用方法，并研究了其精度、性能和能效。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

A Case Study for an Accelerated DCNN on FPGA-Based Embedded Distributed System

Face Detection (FD) recently became the base of multiple applications requiring low latency but also with limited resources and energy budgets. Deep Convolutional Neural Networks (DCNNs) are especially accurate in FD, but latency requirements and energy budgets call for Field Programmable Gate Arrays (FPGAs)-based solutions, trading flexibility and efficiency. Nonetheless, the offer of FPGAs solutions is limited and different chips often require expensive re-design phases, while developers desire solutions whose resources can scale proportionally to the demands. Therefore, this work presents an FD solution based on a DCNN on a distributed, embedded system with FPGAs, proposing a general approach to reduce the DCNN size and to design its FPGA cores and investigating its accuracy, performance, and energy efficiency.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)

自引率

0.00%

发文量