Zhizhou Li, Justin A. Eichel, A. Mishra, Andrew Achkar, Sagar Naik
2017 IEEE 30th Canadian Conference on Electrical and Computer Engineering (CCECE), April 2017. DOI: 10.1109/CCECE.2017.7946788 (https://doi.org/10.1109/CCECE.2017.7946788)
A CPU-based algorithm for traffic optimization based on sparse convolutional neural networks
Sparsity in the weights of deep convolutional networks presents a tremendous opportunity to reduce computational requirements. To optimize the flow of traffic systems, any viable solution must operate in real time. Existing computation frameworks do not yet realize the full potential speedup afforded by sparse neural networks. Meanwhile, the power consumption of a GPU is too great for widely distributed, embedded optimization systems. Here, the authors propose a procedure for realizing the potential of sparse convolutional kernels on a CPU. After preprocessing, a code generator creates well-optimized, deployable code. Measurements of CPU-mode TensorFlow, GPU-mode TensorFlow, and the proposed solution on two different sparse convolutional neural networks show that the proposed solution is 2 to 5 times faster than CPU-mode TensorFlow and consumes less power than GPU-mode TensorFlow. The proposed solution runs in 0.13 s per 321 × 321 RGB image on a 98% sparse network, which is 5 times faster than CPU-mode TensorFlow.
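The abstract does not describe the preprocessing step or the code generator in detail, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a single-channel image stored as a list of rows, a "valid" cross-correlation (the operation CNN frameworks usually call convolution), and Python source as the generated output. All function names (`nonzero_taps`, `generate_conv_source`, `compile_conv`, `dense_conv`) are hypothetical. Preprocessing collects the nonzero weights, and the generator emits a function whose inner expression contains one multiply-add per nonzero weight and nothing for the zeros.

```python
def nonzero_taps(kernel):
    """Preprocessing (illustrative): list (dy, dx, weight) for each nonzero entry."""
    return [(dy, dx, w)
            for dy, row in enumerate(kernel)
            for dx, w in enumerate(row)
            if w != 0.0]


def generate_conv_source(kernel, func_name="sparse_conv"):
    """Emit Python source for a 'valid' 2-D cross-correlation specialised to `kernel`.

    Zero weights contribute no terms, so the work per output pixel is
    proportional to the number of nonzero weights.
    """
    kh, kw = len(kernel), len(kernel[0])
    terms = [f"{w!r} * image[y + {dy}][x + {dx}]"
             for dy, dx, w in nonzero_taps(kernel)]
    expr = " + ".join(terms) if terms else "0.0"
    return (
        f"def {func_name}(image):\n"
        f"    out_h = len(image) - {kh} + 1\n"
        f"    out_w = len(image[0]) - {kw} + 1\n"
        f"    return [[{expr} for x in range(out_w)] for y in range(out_h)]\n"
    )


def compile_conv(kernel, func_name="sparse_conv"):
    """Turn the generated source into a callable (stand-in for a real compile step)."""
    namespace = {}
    exec(generate_conv_source(kernel, func_name), namespace)
    return namespace[func_name]


def dense_conv(image, kernel):
    """Reference dense version that visits every kernel entry, zero or not."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(kernel[dy][dx] * image[y + dy][x + dx]
                 for dy in range(kh) for dx in range(kw))
             for x in range(out_w)]
            for y in range(out_h)]


# Example: a 3 x 3 kernel with only two nonzero weights.
kernel = [[0.0, 0.0, 0.0],
          [0.0, 2.0, 0.0],
          [0.0, 0.0, -1.0]]
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]

sparse_conv = compile_conv(kernel)
result = sparse_conv(image)
```

Here the generated function performs two multiply-adds per output pixel instead of nine. By the same reasoning, at 98% sparsity only 2% of the weights would produce terms, although the speedup actually achieved depends on memory access patterns and on how the generated code is compiled.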