Yang-Ming Yeh, Jennifer Shueh-Inn Hu, Yen-Yu Lin, Yi-Chang Lu
Compressing DNN Parameters for Model Loading Time Reduction
2019 IEEE International Conference on Consumer Electronics - Asia (ICCE-Asia), published 2019-06-01
DOI: 10.1109/icce-asia46551.2019.8942192 (https://doi.org/10.1109/icce-asia46551.2019.8942192)
Citations: 0
Abstract
Deep neural networks (DNNs) have been applied to a wide variety of computer vision tasks. However, DNNs often suffer from long execution times even with the aid of a GPU. In this paper, we argue that the bandwidth bottleneck between the GPU and GDRAM must be addressed. To reduce loading time, we propose a DNN acceleration approach that compresses the DNN parameters before the model is loaded onto the GPU and decompresses them on the GPU. Using JPEG compression as an example, the loss in test accuracy can be kept within 4% while achieving an 8× reduction in parameter size for VGG16.
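The compress-before-loading idea can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract names JPEG with decompression on the GPU, whereas this sketch swaps in 8-bit linear quantization followed by zlib from Python's standard library and runs entirely on the CPU. The function names, the synthetic Gaussian weights, and the layer size are all assumptions made for illustration, and the numbers it prints say nothing about the accuracy or compression figures reported for VGG16.

```python
import random
import zlib


def quantize(weights, levels=256):
    """Map floats linearly onto integer codes 0..levels-1 (the lossy step)."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / (levels - 1) or 1.0  # avoid a zero step for constant input
    codes = bytes(round((w - lo) / scale) for w in weights)
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Invert quantize() up to a rounding error of about scale / 2."""
    return [lo + c * scale for c in codes]


def compress_params(weights):
    """Host side: quantize, then losslessly compress the byte stream.

    zlib is a stand-in for the JPEG codec mentioned in the abstract.
    """
    codes, lo, scale = quantize(weights)
    return zlib.compress(codes, 9), lo, scale


def decompress_params(blob, lo, scale):
    """Would run on the GPU in the paper; plain Python here for illustration."""
    return dequantize(zlib.decompress(blob), lo, scale)


random.seed(0)
weights = [random.gauss(0.0, 0.05) for _ in range(100_000)]  # synthetic layer
raw_bytes = 4 * len(weights)  # size if stored as 32-bit floats

blob, lo, scale = compress_params(weights)
restored = decompress_params(blob, lo, scale)

max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(f"size ratio: {raw_bytes / len(blob):.2f}x, max abs error: {max_err:.6f}")
```

Only `blob`, `lo`, and `scale` would need to cross the bandwidth-limited link in such a scheme. The trade-off is the extra decompression work on the receiving side and the reconstruction error introduced by the lossy step.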