{"title":"Hardware-friendly model compression technique of DNN for edge computing","authors":"Xinyun Liu","doi":"10.1109/CDS52072.2021.00066","DOIUrl":null,"url":null,"abstract":"The research proposes a design methodology to compress the existing DNN models for low-cost edge devices. To reduce the computation complexity and memory cost, several novel model compression techniques are proposed. (1) A DNN model used to conduct image classification tasks is quantized into integer-based model for both the inference and training. 8-bit quantization is chosen in this work to balance the model training accuracy and cost. (2) A stochastic rounding scheme is implemented during the gradient backpropagation process to relieve the gradient diminishing risk. (3) To further reduce the training error caused by the gradient diminishing problem, a dynamic backpropagation algorithm is implemented. By dynamically scaling the magnitudes of gradient during the backpropagation, e.g. enlarging the magnitude of the gradient when it's too small to be quantized, it can effectively overcome the information loss due to the quantization error. As a result, such a DNN model for image classification is quantized into 8-bit model including training, which reduces the computation complexity by 8X and decreases the memory size by 6X Owing to the proposed dynamic backpropagation and stochastic training algorithms, the gradient diminishing issue during backpropagation is relieved. The training speed is reduced by 3X while classification error rates of state-of-art databases, e.g. 
ImageNet and CIFAR-10, are maintained similarly compared to the original model without quantization.","PeriodicalId":380426,"journal":{"name":"2021 2nd International Conference on Computing and Data Science (CDS)","volume":"117 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 2nd International Conference on Computing and Data Science (CDS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CDS52072.2021.00066","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
This research proposes a design methodology for compressing existing DNN models to fit low-cost edge devices. To reduce computation complexity and memory cost, several novel model compression techniques are proposed. (1) A DNN model for image classification tasks is quantized into an integer-based model for both inference and training; 8-bit quantization is chosen in this work to balance training accuracy against cost. (2) A stochastic rounding scheme is applied during gradient backpropagation to mitigate the risk of vanishing gradients. (3) To further reduce the training error caused by vanishing gradients, a dynamic backpropagation algorithm is implemented: by dynamically scaling gradient magnitudes during backpropagation, e.g. enlarging a gradient whose magnitude is too small to be quantized, the algorithm effectively overcomes the information loss due to quantization error. As a result, the image-classification DNN is quantized into an 8-bit model, training included, which reduces computation complexity by 8X and memory size by 6X. Owing to the proposed dynamic backpropagation and stochastic training algorithms, the vanishing-gradient issue during backpropagation is relieved. Training time is reduced by 3X, while classification error rates on state-of-the-art datasets, e.g. ImageNet and CIFAR-10, remain comparable to those of the original model without quantization.
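The two gradient-side tricks the abstract describes — stochastic rounding and dynamic scaling of gradients that are too small for the integer grid — can be illustrated with a minimal NumPy sketch. The paper gives no implementation details, so the function names, the per-tensor max-abs scaling choice, and the signed 8-bit range here are assumptions, not the authors' code:

```python
import numpy as np

def stochastic_round(x, rng):
    """Round to an adjacent integer with probability given by the fraction:
    2.3 rounds up to 3 with probability 0.3 and down to 2 with probability 0.7,
    so the rounding is unbiased in expectation (unlike round-to-nearest,
    which deterministically flushes small values and biases the gradient)."""
    floor = np.floor(x)
    frac = x - floor
    return floor + (rng.random(x.shape) < frac).astype(x.dtype)

def quantize_grad_int8(grad, rng, bits=8):
    """Quantize a float gradient tensor to signed 8-bit integers with
    dynamic scaling (an assumed per-tensor scheme): the tensor is rescaled
    so its largest magnitude fills the int8 range, which enlarges small
    gradients onto the integer grid instead of flushing them to zero."""
    qmax = 2 ** (bits - 1) - 1           # 127 for 8 bits
    max_abs = np.max(np.abs(grad))
    if max_abs == 0.0:
        return np.zeros(grad.shape, dtype=np.int8), 1.0
    scale = max_abs / qmax               # step size of the integer grid
    q = stochastic_round(grad / scale, rng)
    q = np.clip(q, -qmax, qmax).astype(np.int8)
    return q, scale                      # grad is approximated by q * scale
```

With round-to-nearest, a gradient entry smaller than half a quantization step would always become zero; here the same entry survives with probability proportional to its magnitude, which is one plausible reading of how the scheme "relieves" vanishing gradients under 8-bit training.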