LPE: Logarithm Posit Processing Element for Energy-Efficient Edge-Device Training
Yang Wang, Dazheng Deng, Leibo Liu, Shaojun Wei, S. Yin
2021 IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems (AICAS), published 2021-06-06
DOI: 10.1109/AICAS51828.2021.9458421 (https://doi.org/10.1109/AICAS51828.2021.9458421)
Abstract
Edge-device training has recently become an urgent necessity, since it can improve model adaptability without incurring high transmission costs or privacy risks. Because a wide data range and high precision are needed to preserve accuracy, DNN training requires wide floating-point (FP) data for convolution and complicated arithmetic for batch normalization. These requirements lead to massive computation and memory-access energy, which poses challenges for power-constrained edge devices. This paper proposes a novel processing element (PE), called LPE, with three innovations to solve this issue. First, LPE stores operands in the posit format, satisfying both the precision and data-range requirements at a lower bit-width; this reduces training latency and memory-access energy. Second, LPE transforms the complicated arithmetic of training into the logarithm domain, including the multiplications in convolution layers and the divisions, squares, and square roots in batch-normalization layers; this reduces computation energy and improves throughput. Third, LPE contains a two-stage floating-point accumulation unit that extends the computation range while using a low bit-width accumulator, enhancing precision and reducing power consumption. Evaluated in a 28 nm CMOS process, our PE achieves a 1.81× power and 1.35× area reduction compared with an IEEE 754 half-precision (FP16) fused MAC while maintaining the same dynamic range. When training is performed with the proposed PE, it achieves a 1.97× energy reduction and a 1.68× speedup.
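The arithmetic simplification behind the second innovation can be seen in a few lines. Below is a minimal sketch in Python, a software illustration only and not the paper's hardware design (the helper names `log2_encode`, `log_mul`, etc. are ours): in the logarithm domain, multiplication and division reduce to addition and subtraction of log values, while squaring and square-rooting reduce to doubling or halving, which hardware can realize as a one-bit shift.

```python
import math

# Hedged sketch of log-domain arithmetic (illustration, not the LPE hardware).
# Positive magnitudes are encoded as base-2 logarithms; the expensive linear-
# domain operations then map to cheap log-domain ones.

def log2_encode(x: float) -> float:
    """Encode a positive magnitude as its base-2 logarithm."""
    return math.log2(x)

def log2_decode(lx: float) -> float:
    """Decode a log-domain value back to the linear domain."""
    return 2.0 ** lx

# Linear-domain operation -> log-domain equivalent:
def log_mul(la: float, lb: float) -> float:
    return la + lb          # a * b   -> add

def log_div(la: float, lb: float) -> float:
    return la - lb          # a / b   -> subtract

def log_square(la: float) -> float:
    return la * 2           # a ** 2  -> double (1-bit left shift in hardware)

def log_sqrt(la: float) -> float:
    return la / 2           # sqrt(a) -> halve (1-bit right shift in hardware)

if __name__ == "__main__":
    a, b = 6.25, 1.5
    la, lb = log2_encode(a), log2_encode(b)
    assert math.isclose(log2_decode(log_mul(la, lb)), a * b)
    assert math.isclose(log2_decode(log_div(la, lb)), a / b)
    assert math.isclose(log2_decode(log_square(la)), a * a)
    assert math.isclose(log2_decode(log_sqrt(la)), math.sqrt(a))
```

This is exactly why the divisions, squares, and square roots of batch normalization become inexpensive once operands are carried in a logarithmic encoding; the remaining cost lies in the encode/decode boundaries and in accumulation, which the paper addresses with its two-stage floating-point accumulator.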