Junning Jiang, Liang Cai, Feng Dong, Kehua Yu, Ke Chen, Wei Qu, Jianfei Jiang
2019 IEEE 13th International Conference on ASIC (ASICON), October 2019. DOI: 10.1109/ASICON47005.2019.8983456
Deploying and Optimizing Convolutional Neural Networks on Heterogeneous Architecture
Deploying convolutional neural networks (CNNs) on dedicated hardware platforms accelerates inference and is critical for practical applications of artificial intelligence. In this paper, we design an FPGA+CPU heterogeneous platform to accelerate CNNs. We propose dataflow optimization, accelerator structure optimization, and compute-precision optimization to improve the platform's performance. Several ResNet and MobileNet networks are successfully deployed on the platform. Applying the proposed dataflow and precision optimizations improves inference performance by 3.25× on ResNet; applying the accelerator structure and precision optimizations improves inference performance by 3.63× on MobileNet.
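The abstract does not detail the compute-precision optimization. As a hedged illustration only (the paper's actual scheme is not published here), the sketch below shows symmetric per-tensor int8 quantization, a common form of precision reduction used when mapping CNN weights onto FPGA accelerators; all function names are hypothetical.

```python
# Hypothetical sketch of compute-precision optimization: symmetric
# per-tensor int8 quantization of a weight tensor. Not the paper's
# actual method, just one common technique of this kind.
import numpy as np

def quantize_int8(w):
    """Map float weights w to int8 codes q with w ~= scale * q."""
    scale = np.max(np.abs(w)) / 127.0          # full-range symmetric scale
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)  # e.g. a flattened conv kernel
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
max_err = float(np.max(np.abs(w - w_hat)))
# Rounding error per weight is bounded by scale / 2
print(q.dtype, max_err <= scale / 2 + 1e-6)
```

Storing weights as int8 instead of float32 cuts memory traffic 4× and lets the FPGA use narrow integer multipliers, which is the usual motivation for this kind of optimization on accelerator platforms.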