Performance of Training Sparse Deep Neural Networks on GPUs

Jianzong Wang, Zhangcheng Huang, Lingwei Kong, Jing Xiao, Pengyu Wang, Lu Zhang, Chao Li
{"title":"Performance of Training Sparse Deep Neural Networks on GPUs","authors":"Jianzong Wang, Zhangcheng Huang, Lingwei Kong, Jing Xiao, Pengyu Wang, Lu Zhang, Chao Li","doi":"10.1109/HPEC.2019.8916506","DOIUrl":null,"url":null,"abstract":"Deep neural networks have revolutionized the field of machine learning by dramatically improving the state-of-the-art in various domains. The sizes of deep neural networks (DNNs) are rapidly outgrowing the capacity of hardware to fast store and train them. Over the past few decades, researches have explored the prospect of sparse DNNs before, during, and after training by pruning edges from the underlying topology. After the above operation, the generated neural network is known as a sparse neural network. More recent works have demonstrated the remarkable results that certain sparse DNNs can train to the same precision as dense DNNs at lower runtime and storage cost. Although existing methods ease the situation that high demand for computation resources severely hinders the deployment of large-scale DNNs in resource-constrained devices, DNNs can be trained at a faster speed and lower cost. In this work, we propose a Fine-tune Structured Sparsity Learning (FSSL) method to regularize the structures of DNNs and accelerate the training of DNNs. FSSL can: (1) learn a compact structure from large sparse DNN to reduce computation cost; (2) obtain a hardware-friendly to accelerate the DNNs evaluation efficiently. Experimental results of the training time and the compression rate show that superior performance and efficiency than the Matlab example code. These speedups are about twice speedups of non-structured sparsity.","PeriodicalId":184253,"journal":{"name":"2019 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE High Performance Extreme Computing Conference (HPEC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPEC.2019.8916506","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Deep neural networks have revolutionized the field of machine learning by dramatically improving the state of the art in various domains. The sizes of deep neural networks (DNNs) are rapidly outgrowing the capacity of hardware to store and train them quickly. Over the past few decades, researchers have explored the prospect of sparsifying DNNs before, during, and after training by pruning edges from the underlying topology; the resulting network is known as a sparse neural network. More recent work has demonstrated the remarkable result that certain sparse DNNs can be trained to the same precision as dense DNNs at lower runtime and storage cost. Although existing methods ease the situation in which the high demand for computational resources severely hinders the deployment of large-scale DNNs on resource-constrained devices, DNNs can still be trained at faster speed and lower cost. In this work, we propose a Fine-tune Structured Sparsity Learning (FSSL) method to regularize the structures of DNNs and accelerate their training. FSSL can (1) learn a compact structure from a large sparse DNN to reduce computation cost, and (2) obtain a hardware-friendly structure that accelerates DNN evaluation efficiently. Experimental results on training time and compression rate show superior performance and efficiency compared with the MATLAB example code, with speedups roughly twice those of non-structured sparsity.
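The abstract does not specify how FSSL's structured regularizer is constructed. A common way to realize structured sparsity, which the sketch below assumes, is a group Lasso penalty over whole convolution filters, so that entire filters can be driven to zero and pruned, yielding the hardware-friendly structure the abstract refers to. This is an illustrative sketch only: the PyTorch setting, the function name `group_lasso_penalty`, and the coefficient `lambda_s` are assumptions, not details taken from the paper.

```python
# Minimal sketch of a structured-sparsity (group Lasso) penalty over conv filters.
# Assumes a PyTorch model; FSSL's exact grouping and fine-tuning schedule are not
# given in the abstract, so this only illustrates the general technique.
import torch
import torch.nn as nn

def group_lasso_penalty(model: nn.Module) -> torch.Tensor:
    """Sum of L2 norms over output-filter groups of every Conv2d layer.

    Driving a whole group to zero removes an entire filter, which is the kind
    of structured sparsity that maps well onto GPU hardware.
    """
    penalty = torch.zeros((), device=next(model.parameters()).device)
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            # weight shape: (out_channels, in_channels, kH, kW)
            w = module.weight
            # one group per output filter: L2 norm of each flattened filter
            penalty = penalty + w.flatten(1).norm(p=2, dim=1).sum()
    return penalty

# Hypothetical use inside a training step (lambda_s is a tunable coefficient):
#   loss = criterion(model(x), y) + lambda_s * group_lasso_penalty(model)
#   loss.backward(); optimizer.step()
```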