Implementing a Timing Error-Resilient and Energy-Efficient Near-Threshold Hardware Accelerator for Deep Neural Network Inference

IF 1.6 Q3 ENGINEERING, ELECTRICAL & ELECTRONIC

Journal of Low Power Electronics and Applications. Pub Date: 2022-06-06. DOI: 10.3390/jlpea12020032

N. D. Gundi, Pramesh Pandey, Sanghamitra Roy, Koushik Chakraborty


Citations: 1

Abstract

Increasing processing requirements in the Artificial Intelligence (AI) realm have led to the emergence of domain-specific architectures for Deep Neural Network (DNN) applications. The Tensor Processing Unit (TPU), a DNN accelerator by Google, has emerged as a front runner, outperforming its contemporaries, CPUs and GPUs, by 15×–30×. TPUs have been deployed in Google data centers to meet these performance demands. However, the TPU's performance gain comes with very high power consumption. To lower energy use, this paper proposes PREDITOR, a low-power TPU operating in the Near-Threshold Computing (NTC) realm. PREDITOR uses mathematical analysis to mitigate undetectable timing errors by boosting the voltage of selected multiplier-and-accumulator units at specific intervals, which improves the performance of the NTC TPU while maintaining high inference accuracy at low voltage. PREDITOR offers up to 3×–5× better performance than leading-edge error mitigation schemes, with a minor loss in accuracy.
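The appeal of boosting only some multiplier-and-accumulator units, rather than raising the whole array to a higher voltage, can be illustrated with the standard first-order model in which dynamic switching energy scales with the square of the supply voltage. The sketch below is not the paper's analysis: the function name, the voltages, and the boosted fraction are all hypothetical, and it ignores leakage and the overhead of the boosting circuitry.

```python
def dynamic_energy_ratio(boost_fraction: float, v_low: float, v_high: float) -> float:
    """Average dynamic energy per operation, relative to running every
    unit at v_high, when only `boost_fraction` of operations are boosted.

    Assumes dynamic energy is proportional to V^2 (first-order CMOS model).
    All parameters are illustrative assumptions, not values from the paper.
    """
    if not 0.0 <= boost_fraction <= 1.0:
        raise ValueError("boost_fraction must be in [0, 1]")
    if v_low <= 0 or v_high <= 0:
        raise ValueError("voltages must be positive")

    # Boosted operations pay v_high^2; the rest pay v_low^2.
    mixed = boost_fraction * v_high ** 2 + (1.0 - boost_fraction) * v_low ** 2
    return mixed / v_high ** 2


if __name__ == "__main__":
    # Hypothetical operating points: 0.5 V near-threshold, 1.0 V boosted.
    for fraction in (0.0, 0.1, 0.5, 1.0):
        ratio = dynamic_energy_ratio(fraction, v_low=0.5, v_high=1.0)
        print(f"boosted fraction {fraction:.1f}: relative energy {ratio:.3f}")
```

Under these assumed voltages, boosting a small fraction of operations keeps most of the quadratic energy saving of near-threshold operation, which is the intuition behind boosting selectively and only at specific intervals.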
