UNIQ

Chaim Baskin, Natan Liss, Eli Schwartz, Evgenii Zheltonozhskii, R. Giryes, A. Bronstein, A. Mendelson

ACM Transactions on Computer Systems (TOCS), March 2021. DOI: 10.1145/3444943
Citations: 6

Abstract

We present a novel method for neural network quantization. Our method, named UNIQ, emulates a non-uniform k-quantile quantizer and adapts the model to perform well with quantized weights by injecting noise into the weights at training time. As a by-product of this noise injection, we find that activations can also be quantized to as low as 8 bits with only a minor accuracy degradation. Our non-uniform quantization approach provides a novel alternative to existing uniform quantization techniques for neural networks. We further propose a novel complexity metric, the number of bit operations performed (BOPs), show that it has a linear relation with logic utilization and power, and suggest evaluating the accuracy vs. complexity trade-off in BOPs. When evaluated on ResNet18/34/50 and MobileNet on ImageNet, the proposed method outperforms the prior state of the art in both the low-complexity and the high-accuracy regimes. We demonstrate the practical applicability of this approach by implementing our non-uniformly quantized CNN on an FPGA.
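
To make the two ingredients named in the abstract concrete, here is a minimal NumPy sketch of a k-quantile quantizer and of training-time noise injection. The function names and the exact noise model are illustrative assumptions, not the paper's implementation: UNIQ derives its injected noise from the quantizer's bin structure, whereas this sketch simply scales zero-mean uniform noise by each weight's quantization error.

```python
import numpy as np

def kquantile_levels(weights: np.ndarray, num_bits: int) -> np.ndarray:
    # Place 2**num_bits levels at equally spaced quantiles of the
    # empirical weight distribution, so every bin holds an equal share
    # of the weights. This is what makes the quantizer non-uniform.
    k = 2 ** num_bits
    probs = (np.arange(k) + 0.5) / k
    return np.quantile(weights, probs)

def quantize(weights: np.ndarray, levels: np.ndarray) -> np.ndarray:
    # Snap each weight to its nearest quantization level.
    idx = np.abs(weights[..., None] - levels).argmin(axis=-1)
    return levels[idx]

def noisy_weights(weights: np.ndarray, levels: np.ndarray,
                  rng: np.random.Generator) -> np.ndarray:
    # Training-time surrogate: instead of hard rounding, perturb each
    # weight with zero-mean uniform noise whose magnitude matches its
    # quantization error, so the forward pass "feels" quantization
    # while gradients stay informative. (Illustrative noise model.)
    err = quantize(weights, levels) - weights
    return weights + rng.uniform(-1.0, 1.0, size=weights.shape) * np.abs(err)

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 3, 3, 3)).astype(np.float32)
levels = kquantile_levels(w, num_bits=4)   # 16 non-uniform levels
w_train = noisy_weights(w, levels, rng)    # forward pass during training
w_deploy = quantize(w, levels)             # hard-quantized weights at inference
```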
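The BOPs metric can likewise be sketched for a single convolutional layer. The accumulator-width term below follows the formulation commonly associated with this metric (multiplies cost roughly b_w * b_a bit operations; accumulations run at roughly b_w + b_a + log2 of the number of accumulated products); the exact constants here are an assumption for illustration.

```python
import math

def conv_bops(c_in: int, c_out: int, k: int, out_h: int, out_w: int,
              b_w: int, b_a: int) -> int:
    # Each output element needs c_in * k * k multiply-accumulates.
    # A b_w-by-b_a-bit multiply costs about b_w * b_a bit operations;
    # each accumulation is an addition at the accumulator width,
    # roughly b_w + b_a + log2(c_in * k * k) bits.
    macs = c_in * k * k
    acc_bits = b_w + b_a + math.ceil(math.log2(macs))
    per_output = macs * (b_w * b_a + acc_bits)
    return c_out * out_h * out_w * per_output

# Example: a 3x3, 64-to-64-channel conv with 4-bit weights and 8-bit
# activations vs. its 32-bit floating-point counterpart.
low  = conv_bops(64, 64, 3, 56, 56, b_w=4,  b_a=8)
full = conv_bops(64, 64, 3, 56, 56, b_w=32, b_a=32)
print(f"BOPs ratio: {low / full:.3f}")   # ~0.05: over an order of magnitude fewer bit operations
```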