A CPU-based algorithm for traffic optimization based on sparse convolutional neural networks

2017 IEEE 30th Canadian Conference on Electrical and Computer Engineering (CCECE). Pub Date: 2017-04-01. DOI: 10.1109/CCECE.2017.7946788

Zhizhou Li, Justin A. Eichel, A. Mishra, Andrew Achkar, Sagar Naik


Citations: 5

Abstract


Sparsity in the weights of deep convolutional networks presents a tremendous opportunity to reduce computational requirements. To optimize the flow of traffic systems, any viable solution must be able to operate in real time. Existing computation frameworks do not yet realize the full potential speedup afforded by sparse neural networks. Meanwhile, the power consumption of a GPU is too great for widely distributed, embedded optimization systems. Here, the authors propose a procedure for realizing the potential of sparse convolutional kernels on a CPU. After preprocessing, a code generator creates well-optimized, deployable code. Measurements of CPU-mode TensorFlow, GPU-mode TensorFlow, and the proposed solution on two different sparse convolutional neural networks show that the proposed solution is 2 to 5 times faster than CPU-mode TensorFlow and consumes less power than GPU-mode TensorFlow. On a 98% sparse network, the proposed solution runs in 0.13 s per 321 × 321 RGB image, which is 5 times faster than CPU-mode TensorFlow.
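
The abstract does not say how the code generator works. The general idea of generating code for a sparse kernel can be illustrated as follows. This is a minimal sketch in pure Python, not the authors' implementation. It assumes a single-channel image stored as a list of rows and a "valid" (no padding) cross-correlation. The names generate_sparse_conv_source, compile_kernel and dense_conv2d are hypothetical. The generator walks the kernel once, ahead of time, and emits straight-line source containing one multiply-add per nonzero weight, so zero weights cost nothing at run time.

```python
def generate_sparse_conv_source(kernel, func_name="sparse_conv2d"):
    """Emit Python source for a 'valid' 2D cross-correlation.

    Only nonzero weights appear in the generated expression, so the
    work per output pixel scales with the number of nonzero weights.
    (Hypothetical helper for illustration; not from the paper.)
    """
    kh, kw = len(kernel), len(kernel[0])
    terms = [
        f"{w!r} * image[y + {dy}][x + {dx}]"
        for dy, row in enumerate(kernel)
        for dx, w in enumerate(row)
        if w != 0  # skip zero weights at generation time
    ]
    body = " + ".join(terms) if terms else "0"
    return (
        f"def {func_name}(image):\n"
        f"    out_h = len(image) - {kh} + 1\n"
        f"    out_w = len(image[0]) - {kw} + 1\n"
        f"    return [[{body} for x in range(out_w)]\n"
        f"            for y in range(out_h)]\n"
    )


def compile_kernel(kernel, func_name="sparse_conv2d"):
    """Compile the generated source and return the specialized function."""
    namespace = {}
    exec(generate_sparse_conv_source(kernel, func_name), namespace)
    return namespace[func_name]


def dense_conv2d(image, kernel):
    """Reference implementation that visits every weight, zero or not."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(kernel[dy][dx] * image[y + dy][x + dx]
                for dy in range(kh) for dx in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


# Example: a 3 x 3 kernel with only 2 nonzero weights out of 9.
kernel = [[0, 0, 2],
          [0, 0, 0],
          [-1, 0, 0]]
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]

sparse_conv = compile_kernel(kernel)
result = sparse_conv(image)  # same values as dense_conv2d(image, kernel)
```

In this example the generated function performs 2 multiplications per output pixel instead of 9. A deployable generator would presumably emit compiled code (for instance C) and handle multiple channels, strides and vectorization. Those aspects are outside this sketch.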
