Performance of Training Sparse Deep Neural Networks on GPUs

Jianzong Wang, Zhangcheng Huang, Lingwei Kong, Jing Xiao, Pengyu Wang, Lu Zhang, Chao Li
{"title":"Performance of Training Sparse Deep Neural Networks on GPUs","authors":"Jianzong Wang, Zhangcheng Huang, Lingwei Kong, Jing Xiao, Pengyu Wang, Lu Zhang, Chao Li","doi":"10.1109/HPEC.2019.8916506","DOIUrl":null,"url":null,"abstract":"Deep neural networks have revolutionized the field of machine learning by dramatically improving the state-of-the-art in various domains. The sizes of deep neural networks (DNNs) are rapidly outgrowing the capacity of hardware to fast store and train them. Over the past few decades, researches have explored the prospect of sparse DNNs before, during, and after training by pruning edges from the underlying topology. After the above operation, the generated neural network is known as a sparse neural network. More recent works have demonstrated the remarkable results that certain sparse DNNs can train to the same precision as dense DNNs at lower runtime and storage cost. Although existing methods ease the situation that high demand for computation resources severely hinders the deployment of large-scale DNNs in resource-constrained devices, DNNs can be trained at a faster speed and lower cost. In this work, we propose a Fine-tune Structured Sparsity Learning (FSSL) method to regularize the structures of DNNs and accelerate the training of DNNs. FSSL can: (1) learn a compact structure from large sparse DNN to reduce computation cost; (2) obtain a hardware-friendly to accelerate the DNNs evaluation efficiently. Experimental results of the training time and the compression rate show that superior performance and efficiency than the Matlab example code. These speedups are about twice speedups of non-structured sparsity.","PeriodicalId":184253,"journal":{"name":"2019 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE High Performance Extreme Computing Conference (HPEC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPEC.2019.8916506","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Deep neural networks have revolutionized the field of machine learning by dramatically improving the state of the art in various domains. The sizes of deep neural networks (DNNs) are rapidly outgrowing the capacity of hardware to store and train them quickly. Over the past few decades, researchers have explored the prospect of sparsifying DNNs before, during, and after training by pruning edges from the underlying topology; the resulting network is known as a sparse neural network. More recent work has demonstrated the remarkable result that certain sparse DNNs can be trained to the same precision as dense DNNs at lower runtime and storage cost. Although existing methods ease the situation in which the high demand for computational resources severely hinders the deployment of large-scale DNNs on resource-constrained devices, DNNs can still be trained at faster speed and lower cost. In this work, we propose a Fine-tune Structured Sparsity Learning (FSSL) method to regularize the structures of DNNs and accelerate their training. FSSL can (1) learn a compact structure from a large sparse DNN to reduce computation cost, and (2) obtain a hardware-friendly structure that accelerates DNN evaluation efficiently. Experimental results on training time and compression rate show superior performance and efficiency compared with the MATLAB example code, with speedups roughly twice those of non-structured sparsity.
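The abstract does not specify how FSSL's structured regularizer is constructed. A common way to realize structured sparsity, which the sketch below assumes, is a group Lasso penalty over whole convolution filters, so that entire filters can be driven to zero and pruned, yielding the hardware-friendly structure the abstract refers to. This is an illustrative sketch only: the PyTorch setting, the function name `group_lasso_penalty`, and the coefficient `lambda_s` are assumptions, not details taken from the paper.

```python
# Minimal sketch of a structured-sparsity (group Lasso) penalty over conv filters.
# Assumes a PyTorch model; FSSL's exact grouping and fine-tuning schedule are not
# given in the abstract, so this only illustrates the general technique.
import torch
import torch.nn as nn

def group_lasso_penalty(model: nn.Module) -> torch.Tensor:
    """Sum of L2 norms over output-filter groups of every Conv2d layer.

    Driving a whole group to zero removes an entire filter, which is the kind
    of structured sparsity that maps well onto GPU hardware.
    """
    penalty = torch.zeros((), device=next(model.parameters()).device)
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            # weight shape: (out_channels, in_channels, kH, kW)
            w = module.weight
            # one group per output filter: L2 norm of each flattened filter
            penalty = penalty + w.flatten(1).norm(p=2, dim=1).sum()
    return penalty

# Hypothetical use inside a training step (lambda_s is a tunable coefficient):
#   loss = criterion(model(x), y) + lambda_s * group_lasso_penalty(model)
#   loss.backward(); optimizer.step()
```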