Training Deep Neural Networks with Constrained Learning Parameters

Prasanna Date, C. Carothers, J. Mitchell, J. Hendler, M. Magdon-Ismail
{"title":"Training Deep Neural Networks with Constrained Learning Parameters","authors":"Prasanna Date, C. Carothers, J. Mitchell, J. Hendler, M. Magdon-Ismail","doi":"10.1109/ICRC2020.2020.00018","DOIUrl":null,"url":null,"abstract":"Today's deep learning models are primarily trained on CPUs and GPUs. Although these models tend to have low error, they consume high power and utilize large amount of memory owing to double precision floating point learning parameters. Beyond the Moore's law, a significant portion of deep learning tasks would run on edge computing systems, which will form an indispensable part of the entire computation fabric. Subsequently, training deep learning models for such systems will have to be tailored and adopted to generate models that have the following desirable characteristics: low error, low memory, and low power. We believe that deep neural networks (DNNs), where learning parameters are constrained to have a set of finite discrete values, running on neuromorphic computing systems would be instrumental for intelligent edge computing systems having these desirable characteristics. To this extent, we propose the Combinatorial Neural Network Training Algorithm (CoNNTrA), that leverages a coordinate gradient descent-based approach for training deep learning models with finite discrete learning parameters. Next, we elaborate on the theoretical underpinnings and evaluate the computational complexity of CoNNTrA. As a proof of concept, we use CoNNTrA to train deep learning models with ternary learning parameters on the MNIST, Iris and ImageNet data sets and compare their performance to the same models trained using Backpropagation. We use following performance metrics for the comparison: (i) Training error; (ii) Validation error; (iii) Memory usage; and (iv) Training time. Our results indicate that CoNNTTA models use 32 × less memory and have errors at par with the Backpropagation models.","PeriodicalId":320580,"journal":{"name":"2020 International Conference on Rebooting Computing (ICRC)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 International Conference on Rebooting Computing (ICRC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICRC2020.2020.00018","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Today's deep learning models are primarily trained on CPUs and GPUs. Although these models tend to have low error, they consume high power and use large amounts of memory owing to their double-precision floating-point learning parameters. Beyond Moore's law, a significant portion of deep learning tasks will run on edge computing systems, which will form an indispensable part of the entire computation fabric. Consequently, training deep learning models for such systems will have to be tailored and adapted to produce models with the following desirable characteristics: low error, low memory, and low power. We believe that deep neural networks (DNNs) whose learning parameters are constrained to a finite set of discrete values, running on neuromorphic computing systems, would be instrumental in realizing intelligent edge computing systems with these desirable characteristics. To this end, we propose the Combinatorial Neural Network Training Algorithm (CoNNTrA), which leverages a coordinate gradient descent-based approach for training deep learning models with finite discrete learning parameters. Next, we elaborate on the theoretical underpinnings and evaluate the computational complexity of CoNNTrA. As a proof of concept, we use CoNNTrA to train deep learning models with ternary learning parameters on the MNIST, Iris and ImageNet data sets and compare their performance to the same models trained using Backpropagation. We use the following performance metrics for the comparison: (i) training error; (ii) validation error; (iii) memory usage; and (iv) training time. Our results indicate that CoNNTrA models use 32× less memory and achieve errors on par with the Backpropagation models.
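Since only the abstract is reproduced here, the sketch below illustrates just the general idea it describes: updating one learning parameter at a time while restricting it to a finite discrete (here ternary) value set. It is a greedy coordinate search on a toy least-squares objective, not the authors' CoNNTrA algorithm; the names (`coordinate_descent_ternary`, `loss`, `TERNARY`) and the objective are hypothetical.

```python
# Illustrative sketch only: coordinate-wise updates with each weight
# constrained to the ternary set {-1, 0, +1}. This is NOT the CoNNTrA
# implementation from the paper; it only shows the flavor of optimizing
# one discrete learning parameter at a time.
import numpy as np

TERNARY = (-1.0, 0.0, 1.0)  # finite discrete value set for every weight

def loss(w, X, y):
    """Mean squared error of a toy linear model with weights w."""
    return np.mean((X @ w - y) ** 2)

def coordinate_descent_ternary(X, y, n_sweeps=10, seed=0):
    rng = np.random.default_rng(seed)
    w = rng.choice(TERNARY, size=X.shape[1])  # random ternary initialization
    for _ in range(n_sweeps):
        for j in range(w.size):               # update one coordinate at a time
            best_v, best_l = w[j], loss(w, X, y)
            for v in TERNARY:                 # try each allowed discrete value
                w[j] = v
                l = loss(w, X, y)
                if l < best_l:
                    best_v, best_l = v, l
            w[j] = best_v                     # keep the value with the lowest loss
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 8))
    true_w = rng.choice(TERNARY, size=8)
    y = X @ true_w
    w_hat = coordinate_descent_ternary(X, y)
    print("recovered weights:", w_hat)
    print("final loss:", loss(w_hat, X, y))
```

Consistent with the abstract's 32× figure, storing each ternary parameter in 2 bits instead of a 64-bit double reduces parameter memory by a factor of 32, and a coordinate-wise search of this kind keeps every intermediate model feasible with respect to the discrete constraint.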