Empirical Analysis of Fixed Point Precision Quantization of CNNs

Anaam Ansari, T. Ogunfunmi
{"title":"Empirical Analysis of Fixed Point Precision Quantization of CNNs","authors":"Anaam Ansari, T. Ogunfunmi","doi":"10.1109/MWSCAS.2019.8885263","DOIUrl":null,"url":null,"abstract":"Image classification, speech processing, autonomous driving, and medical diagnosis have made Convolutional Neural Networks (CNN) mainstream. Due to their success, many deep networks have been developed such as AlexNet, VGGNet, GoogleNet, ResidualNet [1]–[4],etc. Implementing these deep and complex networks in hardware is a challenge. There have been many hardware and algorithmic solutions to improve the throughput, latency and accuracy. Compression and optimization techniques help reduce the size of the model while maintaining the accuracy. Traditionally, quantization of weights and inputs are used to reduce the memory transfer and power consumption. Quantizing the outputs of layers can be a challenge since the output of each layer changes with the input. In this paper, we use quantization on the output of each layer for AlexNet and VGGNET16 sequentially to analyze the effect it has on accuracy. We use Signal to Quantization Noise Ratio (SQNR) to empirically determine the integer length (IL) as well as the fractional length (FL) for the fixed point precision. Based on our observations, we can report that accuracy is sensitive to fractional length as well as integer length. For AlexNet we observe deterioration in accuracy as the word length decreases. The results are similar in the case of VGGNET16.","PeriodicalId":287815,"journal":{"name":"2019 IEEE 62nd International Midwest Symposium on Circuits and Systems (MWSCAS)","volume":"27 2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE 62nd International Midwest Symposium on Circuits and Systems (MWSCAS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MWSCAS.2019.8885263","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

Image classification, speech processing, autonomous driving, and medical diagnosis have made Convolutional Neural Networks (CNNs) mainstream. Due to their success, many deep networks have been developed, such as AlexNet, VGGNet, GoogleNet, and ResidualNet [1]–[4]. Implementing these deep and complex networks in hardware is a challenge, and many hardware and algorithmic solutions have been proposed to improve throughput, latency, and accuracy. Compression and optimization techniques help reduce the size of the model while maintaining accuracy. Traditionally, quantization of weights and inputs is used to reduce memory transfers and power consumption. Quantizing the outputs of layers can be a challenge, since the output of each layer changes with the input. In this paper, we quantize the output of each layer of AlexNet and VGGNET16 sequentially to analyze the effect this has on accuracy. We use the Signal-to-Quantization-Noise Ratio (SQNR) to empirically determine the integer length (IL) and the fractional length (FL) of the fixed-point precision. Based on our observations, accuracy is sensitive to both fractional length and integer length. For AlexNet, we observe a deterioration in accuracy as the word length decreases; the results are similar for VGGNET16.
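The abstract describes quantizing layer outputs to a fixed-point format parameterized by an integer length (IL) and a fractional length (FL), with SQNR used to score candidate formats. As a minimal illustrative sketch (not the paper's implementation), the Python below quantizes a synthetic activation tensor under a signed two's-complement fixed-point format and reports the resulting SQNR; the function names, the rounding/saturation convention, and the synthetic data are assumptions for illustration only.

```python
import numpy as np

def quantize_fixed_point(x, il, fl):
    """Quantize x to a signed fixed-point format with `il` integer bits and
    `fl` fractional bits (word length = 1 sign bit + il + fl). Values are
    rounded to the nearest representable step and saturated at the range
    limits. This two's-complement convention is an assumption, not
    necessarily the paper's exact format."""
    step = 2.0 ** -fl                # smallest representable increment
    max_val = 2.0 ** il - step       # largest positive representable value
    min_val = -(2.0 ** il)           # most negative representable value
    return np.clip(np.round(x / step) * step, min_val, max_val)

def sqnr_db(x, x_q):
    """Signal-to-Quantization-Noise Ratio in dB: the ratio of signal power
    to quantization-error power."""
    noise = x - x_q
    return 10.0 * np.log10(np.sum(x ** 2) / np.sum(noise ** 2))

# Example: sweep fractional lengths for a stand-in layer-output tensor.
rng = np.random.default_rng(0)
activations = rng.normal(0.0, 2.0, size=10_000)  # hypothetical layer output
for fl in (2, 4, 6, 8):
    q = quantize_fixed_point(activations, il=3, fl=fl)
    print(f"IL=3, FL={fl}: SQNR = {sqnr_db(activations, q):.1f} dB")
```

Sweeping FL at a fixed IL illustrates the trade-off the paper measures: too few fractional bits increase rounding noise, while too few integer bits cause saturation, so both lengths affect the SQNR and, downstream, the classification accuracy.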