An Efficient Mixed Bit-width Searching Strategy for CNN Quantization based on BN Scale Factors

Xuecong Han, Xulin Zhou, Zhongjian Ma
2022 7th International Conference on Multimedia and Image Processing · Published 2022-01-14 · DOI: 10.1145/3517077.3517108 · Citations: 0

Abstract

In recent years, the rapid development of mixed-precision quantization has greatly reduced model size and computational cost. However, previous mixed bit-width strategies, such as reinforcement-learning-based and Hessian-matrix-based approaches, are overly complex. This paper proposes an efficient mixed bit-width search strategy that measures the sensitivity of each convolutional layer using the scale factors of its BN layer. The advantage of this strategy is that it reuses the parameters of the pre-trained model and introduces no extra computation, greatly simplifying the bit-width selection process. Comparative experiments with ResNet18 and ResNet50 evaluate the proposed strategy against several previous algorithms in terms of accuracy, model size, and computational cost. The results show that the accuracy drop from quantization stays within 2% of the FP32 baseline and is about 0.5% below HAWQ; overall, performance is similar to HAWQ. The paper also compares the computational complexity of bit-width selection against that of HAWQ-V3, showing that the proposed strategy is far less expensive than HAWQ-V3.
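To make the core idea concrete, here is a minimal, hypothetical sketch of BN-scale-factor-based bit-width assignment: each layer's sensitivity is taken as the mean absolute BN scale factor (γ), layers are ranked by sensitivity, and more sensitive layers receive higher precision. The layer names, γ values, function names, and the even split across bit-widths are illustrative assumptions, not the paper's exact procedure.

```python
# Hypothetical sketch of the abstract's idea: rank convolutional layers
# by the mean absolute BN scale factor (gamma) and assign bit-widths so
# that more sensitive layers keep higher precision. All names and
# values below are illustrative, not taken from the paper.

def sensitivity(gammas):
    """Mean absolute BN scale factor, used as a proxy for layer sensitivity."""
    return sum(abs(g) for g in gammas) / len(gammas)

def assign_bitwidths(bn_gammas, bit_choices=(8, 4)):
    """Map each layer to a bit-width: the most sensitive layers get the
    highest precision in bit_choices, the least sensitive the lowest.

    bn_gammas: dict mapping layer name -> list of BN gamma values.
    Returns: dict mapping layer name -> chosen bit-width.
    """
    # Rank layers from most to least sensitive.
    ranked = sorted(bn_gammas, key=lambda k: sensitivity(bn_gammas[k]),
                    reverse=True)
    # Split the ranked layers evenly across the available bit-widths.
    per_bucket = -(-len(ranked) // len(bit_choices))  # ceiling division
    return {name: bit_choices[i // per_bucket]
            for i, name in enumerate(ranked)}

# Toy example: large gammas suggest the layer matters more to the output.
gammas = {
    "conv1":  [0.9, 1.2, 0.8],    # large gammas -> sensitive -> 8 bits
    "layer1": [0.1, 0.05, 0.2],   # small gammas -> robust  -> 4 bits
    "layer2": [0.6, 0.7, 0.5],
    "layer3": [0.02, 0.03, 0.01],
}
bits = assign_bitwidths(gammas)
print(bits)  # conv1 and layer2 get 8 bits; layer1 and layer3 get 4 bits
```

Because the γ values are read directly from a pre-trained model's BN layers, this kind of ranking needs no gradient or Hessian computation, which is the source of the complexity advantage the abstract claims over reinforcement-learning and Hessian-based searches.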