{"title":"Benchmarking TensorFlow Lite Quantization Algorithms for Deep Neural Networks","authors":"Ioan Lucan Orășan, Ciprian Seiculescu, C. Căleanu","doi":"10.1109/SACI55618.2022.9919465","DOIUrl":null,"url":null,"abstract":"Deploying deep neural network models on the resource constrained devices, e.g., lost-cost microcontrollers, is challenging because they are mostly limited in terms of memory footprint and computation capabilities. Quantization is one of the widely used solutions to reduce the size of a model. For parameter representation, it employs for example just 8-bit integer or less instead of 32-bit floating point. The TensorFlow Lite deep learning framework currently provides four methods for post-training quantization. The aim of this paper is to benchmark these quantization methods using various deep neural models of different sizes. The main outcomes of the paper are: (1) the compression ratio obtained for each quantization method for deep neural models of small, medium, and large sizes, (2) a comparison of the accuracy results relative to the original accuracy, and (3) a viewpoint for the decision to choose the quantization method depending on the model size.","PeriodicalId":105691,"journal":{"name":"2022 IEEE 16th International Symposium on Applied Computational Intelligence and Informatics (SACI)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE 16th International Symposium on Applied Computational Intelligence and Informatics (SACI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SACI55618.2022.9919465","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Deploying deep neural network models on resource-constrained devices, e.g., low-cost microcontrollers, is challenging because such devices are limited in memory footprint and computational capability. Quantization is one of the most widely used techniques for reducing model size: it represents parameters with, for example, 8-bit integers or less instead of 32-bit floating point. The TensorFlow Lite deep learning framework currently provides four methods for post-training quantization. The aim of this paper is to benchmark these quantization methods using deep neural models of various sizes. The main outcomes of the paper are: (1) the compression ratio obtained by each quantization method on small, medium, and large deep neural models, (2) a comparison of the resulting accuracy relative to the original accuracy, and (3) guidance for choosing a quantization method depending on model size.
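For context, a minimal sketch of how the four post-training quantization paths are typically invoked through TensorFlow Lite's `tf.lite.TFLiteConverter` API follows. The abstract does not name the four methods; those shown (dynamic range, full integer, float16, and 16x8) are the four documented in the TensorFlow Lite guide and are presumed to be the ones benchmarked. The model path, input shape, and random calibration data are illustrative assumptions, not details from the paper.

```python
# Hypothetical sketch of TensorFlow Lite post-training quantization.
# SAVED_MODEL_DIR and the (1, 224, 224, 3) input shape are placeholders.
import numpy as np
import tensorflow as tf

SAVED_MODEL_DIR = "path/to/saved_model"  # placeholder path, not from the paper

def representative_dataset():
    """Yields calibration samples for activation-range estimation."""
    for _ in range(100):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

def new_converter():
    # A fresh converter per method avoids settings leaking between runs.
    return tf.lite.TFLiteConverter.from_saved_model(SAVED_MODEL_DIR)

# 1) Dynamic range quantization: 8-bit weights, float activations at runtime.
conv = new_converter()
conv.optimizations = [tf.lite.Optimize.DEFAULT]
dynamic_range_tflite = conv.convert()

# 2) Full integer quantization: 8-bit weights and activations; requires a
#    representative dataset to calibrate activation ranges.
conv = new_converter()
conv.optimizations = [tf.lite.Optimize.DEFAULT]
conv.representative_dataset = representative_dataset
conv.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
conv.inference_input_type = tf.int8
conv.inference_output_type = tf.int8
full_integer_tflite = conv.convert()

# 3) Float16 quantization: weights stored as 16-bit floats, roughly halving
#    model size while keeping float execution.
conv = new_converter()
conv.optimizations = [tf.lite.Optimize.DEFAULT]
conv.target_spec.supported_types = [tf.float16]
float16_tflite = conv.convert()

# 4) 16x8 quantization: 16-bit activations with 8-bit weights, trading some
#    compression for accuracy on activation-sensitive models.
conv = new_converter()
conv.optimizations = [tf.lite.Optimize.DEFAULT]
conv.representative_dataset = representative_dataset
conv.target_spec.supported_ops = [
    tf.lite.OpsSet.EXPERIMENTAL_TFLITE_BUILTINS_ACTIVATIONS_INT16_WEIGHTS_INT8
]
int16x8_tflite = conv.convert()
```

Under this setup, a compression ratio of the kind the paper reports could be estimated by comparing `len(<variant>_tflite)` in bytes against the original SavedModel's size on disk.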