TB-DNN: A Thin Binarized Deep Neural Network with High Accuracy
Jie Wang, Xi Jin, Wei Wu
2020 22nd International Conference on Advanced Communication Technology (ICACT), 2020-02-01
DOI: 10.23919/ICACT48636.2020.9061291 (https://doi.org/10.23919/ICACT48636.2020.9061291)
Deep neural networks (DNNs) are widely used in artificial intelligence (AI) applications. However, their heavy demand for computing and storage resources and their high power consumption make deploying DNN models on embedded devices challenging. Recent work has shown that DNN models can be compressed by removing internal redundancy without obvious performance degradation. In this work, we propose a two-stage pipeline to compress the ResNet-14 model and evaluate it on the CIFAR-10 and SVHN datasets. First, we apply a filter-level pruning method that removes the less important filters at different compression rates, which considerably reduces the computation cost. Second, we binarize the pruned model to further reduce model size and computational complexity. The training results show that we achieve 87.7% accuracy with a model size of only 1.86 Mb on CIFAR-10 and 96.2% accuracy with 1.34 Mb on SVHN. Compared to the original model, this is a 57% to 68% reduction in FLOPs and a 45.6× to 63.1× compression in model size, at the cost of a roughly 4% drop in accuracy. Finally, we implement the thin binarized ResNet-14 model on the Xilinx KC705 board with a shared, flexible accumulator, which saves 46.8% of logic resources. All network parameters are stored in on-chip RAM, which greatly reduces the energy consumption and memory overhead caused by off-chip accesses. The experimental results show that on the CIFAR-10 dataset we achieve an overall performance of 1200 FPS and an energy efficiency of 571 FPS/W, which are 2.3× and 3.6× improvements, respectively, over the most recent work.
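The two-stage pipeline can be illustrated with a minimal sketch. The abstract does not state how filter importance is measured or how binarization is performed, so the L1-norm score, the sign-based binarization, the flattened filter layout, and all function names below are assumptions chosen for illustration, not the authors' implementation.

```python
from typing import List

# One flattened convolution filter (hypothetical layout for this sketch).
Filter = List[float]


def l1_norm(filt: Filter) -> float:
    """Sum of absolute weights, used here as the importance score (an assumption)."""
    return sum(abs(w) for w in filt)


def prune_filters(filters: List[Filter], keep_ratio: float) -> List[Filter]:
    """Stage 1: keep the highest-scoring filters, preserving their original order."""
    n_keep = max(1, round(len(filters) * keep_ratio))
    ranked = sorted(range(len(filters)),
                    key=lambda i: l1_norm(filters[i]),
                    reverse=True)
    kept = sorted(ranked[:n_keep])
    return [filters[i] for i in kept]


def binarize(filt: Filter) -> List[int]:
    """Stage 2: map each weight to +1 or -1 by its sign (zero maps to +1)."""
    return [1 if w >= 0 else -1 for w in filt]


def compress(filters: List[Filter], keep_ratio: float) -> List[List[int]]:
    """Prune first, then binarize the surviving filters."""
    return [binarize(f) for f in prune_filters(filters, keep_ratio)]


# Toy layer with four filters of four weights each (made-up values).
layer = [
    [0.9, -0.8, 0.7, -0.6],
    [0.01, -0.02, 0.0, 0.01],
    [-0.5, 0.4, -0.3, 0.2],
    [0.05, 0.03, -0.04, 0.02],
]
thin_binary = compress(layer, keep_ratio=0.5)
print(thin_binary)
```

In this toy layer the two filters with near-zero weights are dropped, and each surviving weight needs one bit instead of a full-precision value. A real pipeline would also retrain the network after each stage to recover accuracy, which this sketch omits.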