Memory-efficient training of binarized neural networks on the edge

Mikail Yayla, Jian-Jia Chen
{"title":"Memory-efficient training of binarized neural networks on the edge","authors":"Mikail Yayla, Jian-Jia Chen","doi":"10.1145/3489517.3530496","DOIUrl":null,"url":null,"abstract":"A visionary computing paradigm is to train resource efficient neural networks on the edge using dedicated low-power accelerators instead of cloud infrastructures, eliminating communication overheads and privacy concerns. One promising resource-efficient approach for inference is binarized neural networks (BNNs), which binarize parameters and activations. However, training BNNs remains resource demanding. State-of-the-art BNN training methods, such as the binary optimizer (Bop), require to store and update a large number of momentum values in the floating point (FP) format. In this work, we focus on memory-efficient FP encodings for the momentum values in Bop. To achieve this, we first investigate the impact of arbitrary FP encodings. When the FP format is not properly chosen, we prove that the updates of the momentum values can be lost and the quality of training is therefore dropped. With the insights, we formulate a metric to determine the number of unchanged momentum values in a training iteration due to the FP encoding. Based on the metric, we develop an algorithm to find FP encodings that are more memory-efficient than the standard FP encodings. In our experiments, the memory usage in BNN training is decreased by factors 2.47x, 2.43x, 2.04x, depending on the BNN model, with minimal accuracy cost (smaller than 1%) compared to using 32-bit FP encoding.","PeriodicalId":373005,"journal":{"name":"Proceedings of the 59th ACM/IEEE Design Automation Conference","volume":"57 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 59th ACM/IEEE Design Automation Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3489517.3530496","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

A visionary computing paradigm is to train resource-efficient neural networks on the edge using dedicated low-power accelerators instead of cloud infrastructures, eliminating communication overheads and privacy concerns. One promising resource-efficient approach for inference is binarized neural networks (BNNs), which binarize parameters and activations. However, training BNNs remains resource demanding. State-of-the-art BNN training methods, such as the binary optimizer (Bop), require storing and updating a large number of momentum values in floating-point (FP) format. In this work, we focus on memory-efficient FP encodings for the momentum values in Bop. To achieve this, we first investigate the impact of arbitrary FP encodings. We prove that when the FP format is not properly chosen, updates of the momentum values can be lost, degrading the quality of training. With these insights, we formulate a metric that determines the number of momentum values left unchanged in a training iteration due to the FP encoding. Based on this metric, we develop an algorithm to find FP encodings that are more memory-efficient than the standard FP encodings. In our experiments, memory usage in BNN training decreases by factors of 2.47x, 2.43x, and 2.04x, depending on the BNN model, with minimal accuracy cost (smaller than 1%) compared to using the 32-bit FP encoding.
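
The following NumPy sketch is not from the paper; it only illustrates the effect the abstract describes: when momentum values are stored in a coarse FP encoding, a Bop-style update of the form m ← (1 − γ)·m + γ·g can round back to the previous stored value, so the update is effectively lost. The `quantize_fp` rounding model, the exponent/mantissa widths, and γ = 1e-4 are illustrative assumptions, not the paper's actual encodings or algorithm.

```python
import numpy as np

def quantize_fp(x, exp_bits=5, man_bits=2):
    """Round-to-nearest model of a reduced-precision FP encoding with the
    given exponent/mantissa widths (illustrative, not the paper's format)."""
    x = np.asarray(x, dtype=np.float64)
    out = np.zeros_like(x)
    nz = x != 0
    man, exp = np.frexp(x[nz])           # x = man * 2**exp, |man| in [0.5, 1)
    bias = 2 ** (exp_bits - 1)
    exp = np.clip(exp, -bias + 1, bias)  # clamp to the representable exponent range
    scale = 2.0 ** man_bits
    man = np.round(man * scale) / scale  # keep man_bits fractional mantissa bits
    out[nz] = np.ldexp(man, exp)
    return out

def bop_momentum_step(m, grad, gamma=1e-4, encode=lambda v: v):
    """One Bop-style exponential-moving-average momentum update,
    written back through a (possibly lossy) FP encoding."""
    return encode((1.0 - gamma) * m + gamma * grad)

rng = np.random.default_rng(0)
grad = rng.normal(scale=1e-2, size=10_000)
# Momentum values already stored in the low-precision encoding.
m_lp = quantize_fp(rng.normal(scale=1e-2, size=10_000))

m_lp_next = bop_momentum_step(m_lp, grad, encode=quantize_fp)

# Metric in the spirit of the abstract: count momentum values that did not
# change in this iteration because the small update was absorbed by the
# coarse FP encoding.
unchanged = int(np.sum(m_lp_next == m_lp))
print(f"unchanged momentum values: {unchanged} / {m_lp.size}")
```

With only two mantissa bits, almost all updates in this toy run are absorbed, which is the failure mode the paper's metric is designed to quantify when searching for more memory-efficient FP encodings.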