Accelerating CNN computation: quantisation tuning and network resizing
Alexandre Vieira, F. Pratas, L. Sousa, A. Ilic
ANDARE '18, 2018-11-04. DOI: 10.1145/3295816.3295820
Interest in developing cognitive-aware systems, especially for vision applications based on artificial neural networks, has grown exponentially in recent years. While high-performance systems are key to the success of current Convolutional Neural Network (CNN) implementations, there is a trend to bring these capabilities to embedded real-time systems. This work contributes to tackling this challenge by exploring the CNN design space. Namely, it combines parameter quantisation techniques with a proposed set of CNN architectural transformations to reduce resource and execution time costs on Field Programmable Gate Array (FPGA) devices while maintaining high classification accuracy. A hardware mapping methodology is also proposed for deploying resource-constrained CNNs onto a reconfigurable platform for efficient algorithm acceleration. The proposed transformations reduce the accuracy loss due to quantisation by 44% on average. In addition, analysis of the performance results obtained on a Central Processing Unit (CPU)+FPGA platform shows up to a 50% reduction in execution time compared with a state-of-the-art implementation.
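To make the idea of parameter quantisation concrete, the following is a minimal sketch of one common approach: rounding weights to signed fixed-point values. It is an illustration only and is not taken from the paper; the function names, the 8-bit word length, the 6 fractional bits, and the sample weights are all assumptions chosen for the example.

```python
def quantise_fixed_point(weights, total_bits=8, frac_bits=6):
    """Round each weight to a signed fixed-point value (hypothetical helper).

    total_bits and frac_bits are illustrative choices, not values from the paper.
    """
    scale = 1 << frac_bits                   # number of quantisation steps per unit
    q_min = -(1 << (total_bits - 1))         # most negative integer code
    q_max = (1 << (total_bits - 1)) - 1      # most positive integer code
    quantised = []
    for w in weights:
        code = round(w * scale)              # nearest integer code
        code = max(q_min, min(q_max, code))  # saturate to the representable range
        quantised.append(code / scale)       # map the code back to a real value
    return quantised


def max_abs_error(original, quantised):
    """Largest absolute difference introduced by quantisation."""
    return max(abs(a - b) for a, b in zip(original, quantised))


# Made-up weights, used only to exercise the functions above.
weights = [0.7312, -0.1204, 1.25, -2.5, 0.0031]
q = quantise_fixed_point(weights, total_bits=8, frac_bits=6)
print(q)
print(max_abs_error(weights, q))
```

With these assumed settings the representable range is [-2, 1.984375], so the weight -2.5 saturates to -2.0 and dominates the error. Shorter word lengths need fewer FPGA resources but introduce larger errors of this kind, which is the kind of accuracy loss the abstract reports reducing.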