Adaptive Computation Reuse for Energy-Efficient Training of Deep Neural Networks

Jason Servais, E. Atoofian
{"title":"Adaptive Computation Reuse for Energy-Efficient Training of Deep Neural Networks","authors":"Jason Servais, E. Atoofian","doi":"10.1145/3487025","DOIUrl":null,"url":null,"abstract":"In recent years, Deep Neural Networks (DNNs) have been deployed into a diverse set of applications from voice recognition to scene generation mostly due to their high-accuracy. DNNs are known to be computationally intensive applications, requiring a significant power budget. There have been a large number of investigations into energy-efficiency of DNNs. However, most of them primarily focused on inference while training of DNNs has received little attention.\n This work proposes an adaptive technique to identify and avoid redundant computations during the training of DNNs. Elements of activations exhibit a high degree of similarity, causing inputs and outputs of layers of neural networks to perform redundant computations. Based on this observation, we propose Adaptive Computation Reuse for Tensor Cores (ACRTC) where results of previous arithmetic operations are used to avoid redundant computations. ACRTC is an architectural technique, which enables accelerators to take advantage of similarity in input operands and speedup the training process while also increasing energy-efficiency. ACRTC dynamically adjusts the strength of computation reuse based on the tolerance of precision relaxation in different training phases. Over a wide range of neural network topologies, ACRTC accelerates training by 33% and saves energy by 32% with negligible impact on accuracy.","PeriodicalId":183677,"journal":{"name":"ACM Trans. Embed. Comput. Syst.","volume":"4 3 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Trans. Embed. Comput. Syst.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3487025","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

In recent years, Deep Neural Networks (DNNs) have been deployed in a diverse set of applications, from voice recognition to scene generation, largely due to their high accuracy. DNNs are known to be computationally intensive applications, requiring a significant power budget. There have been a large number of investigations into the energy efficiency of DNNs. However, most of them focus primarily on inference, while the training of DNNs has received little attention. This work proposes an adaptive technique to identify and avoid redundant computations during the training of DNNs. Elements of activations exhibit a high degree of similarity, causing the layers of a neural network to perform redundant computations on their inputs and outputs. Based on this observation, we propose Adaptive Computation Reuse for Tensor Cores (ACRTC), in which the results of previous arithmetic operations are reused to avoid redundant computations. ACRTC is an architectural technique that enables accelerators to take advantage of similarity in input operands, speeding up the training process while also increasing energy efficiency. ACRTC dynamically adjusts the strength of computation reuse based on the tolerance for precision relaxation in different training phases. Over a wide range of neural network topologies, ACRTC accelerates training by 33% and reduces energy by 32% with negligible impact on accuracy.
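To make the reuse idea concrete, below is a minimal software sketch of the general mechanism the abstract describes: results of recent arithmetic operations are cached under keys formed by truncating the low-order bits of the operands, so nearly identical operands hit the same entry, and the number of truncated bits (the reuse "strength") is relaxed or tightened per training phase. Everything here is a hypothetical illustration, not the paper's hardware design: the names `ReuseCache` and `set_phase`, the bit-truncation keying, and the two-level phase schedule are all assumptions made for the sake of the example.

```python
import numpy as np

class ReuseCache:
    """Toy model of computation reuse (hypothetical; not the ACRTC
    hardware). Results of recent multiplies are cached, keyed by the
    operands with their low mantissa bits truncated. Dropping more bits
    raises the hit rate at the cost of precision."""

    def __init__(self, dropped_bits=8):
        self.dropped_bits = dropped_bits   # reuse "strength"
        self.cache = {}
        self.hits = 0
        self.lookups = 0

    def _key(self, a, b):
        # Mask off the low bits of each float32 operand so that nearly
        # identical operands map to the same cache entry.
        mask = np.uint32(0xFFFFFFFF) << np.uint32(self.dropped_bits)
        ka = np.float32(a).view(np.uint32) & mask
        kb = np.float32(b).view(np.uint32) & mask
        return (int(ka), int(kb))

    def multiply(self, a, b):
        self.lookups += 1
        key = self._key(a, b)
        if key in self.cache:
            self.hits += 1                 # redundant multiply avoided
            return self.cache[key]
        result = np.float32(a) * np.float32(b)
        self.cache[key] = result
        return result

def set_phase(cache, epoch, total_epochs):
    # Assumed schedule: early training tolerates more precision
    # relaxation than late fine-tuning, so reuse strength is dialed
    # down over time and the cache is reset for the new precision.
    cache.dropped_bits = 16 if epoch < total_epochs // 2 else 8
    cache.cache.clear()
    cache.hits = cache.lookups = 0

# Demo: activations in real networks cluster around few values (e.g.,
# after ReLU), which is the similarity the technique exploits; mimic
# that here by drawing operands from a small value set.
rng = np.random.default_rng(0)
vals = np.linspace(0.0, 1.0, 64, dtype=np.float32)
acts, wts = rng.choice(vals, 5000), rng.choice(vals, 5000)

cache = ReuseCache()
for epoch in range(4):
    set_phase(cache, epoch, total_epochs=4)
    for a, w in zip(acts, wts):
        cache.multiply(a, w)
    print(f"epoch {epoch}: dropped_bits={cache.dropped_bits}, "
          f"hit rate {cache.hits / cache.lookups:.1%}")
```

In this sketch the hit rate is higher in the early, coarse-keyed epochs and drops once fewer bits are truncated, mirroring the abstract's point that reuse strength can be traded against precision as training progresses.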