Post-training quantization for efficient FPGA-based neural network acceleration

IF 2.5 · Region 3 (Engineering & Technology) · JCR Q3 · COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Integration-The Vlsi Journal Pub Date : 2025-08-14 DOI:10.1016/j.vlsi.2025.102508

Oumayma Bel Haj Salah, Seifeddine Messaoud, Mohamed Ali Hajjaji, Mohamed Atri, Noureddine Liouane

Integration-The Vlsi Journal, volume 105, Article 102508. Journal Article; not open access. https://www.sciencedirect.com/science/article/pii/S0167926025001658

Citations: 0

Abstract

The widespread success of Convolutional Neural Networks (CNNs) in computer vision has been accompanied by soaring computational demands, often requiring high-performance GPUs for real-time inference. However, such hardware is impractical in embedded and resource-constrained environments. To address this, we propose a post-training quantization (PTQ) framework that converts CNN models from FP32 to INT8 without retraining and is optimized for FPGA deployment. Using asymmetric quantization and TensorFlow Lite, we implemented VGG16 and ResNet50 on a PYNQ-Z1 Field-Programmable Gate Array (FPGA). The quantized VGG16 achieved a 67% increase in throughput (from 150 FPS to 250 FPS), a 68% reduction in latency, and a 52% improvement in Power-Delay Product (PDP). ResNet50 saw a gain of over 420% in DSP efficiency, a 3100% increase in LUT efficiency, and a 94% PDP reduction. Despite a marginal accuracy loss, both models showed significantly improved energy efficiency and performance-per-resource utilization. Our results confirm that PTQ enables scalable, low-power AI inference suitable for real-time applications on edge and embedded systems.
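The abstract names asymmetric FP32-to-INT8 quantization but gives no formulas. The following is a minimal NumPy sketch of the common affine scheme (a scale plus a zero point), not the authors' implementation and not the TensorFlow Lite converter itself. The function names `quantize_params`, `quantize`, and `dequantize` and the sample weights are hypothetical, chosen only for illustration.

```python
import numpy as np


def quantize_params(x_min, x_max, qmin=-128, qmax=127):
    """Derive an affine (asymmetric) scale and zero point for a value range."""
    # Widen the range to include 0.0 so that zero is exactly representable.
    x_min = min(float(x_min), 0.0)
    x_max = max(float(x_max), 0.0)
    scale = (x_max - x_min) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0  # degenerate all-zero tensor; any positive scale works
    zero_point = int(round(qmin - x_min / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map FP32 values to INT8: q = clip(round(x / scale) + zero_point)."""
    q = np.round(x / scale) + zero_point
    return np.clip(q, qmin, qmax).astype(np.int8)


def dequantize(q, scale, zero_point):
    """Map INT8 values back to FP32: x ~ (q - zero_point) * scale."""
    return (q.astype(np.float32) - zero_point) * scale


# Illustrative values only (not taken from the paper).
weights = np.array([-0.5, 0.0, 0.25, 1.5], dtype=np.float32)
scale, zp = quantize_params(weights.min(), weights.max())
q = quantize(weights, scale, zp)
recovered = dequantize(q, scale, zp)
max_error = float(np.max(np.abs(weights - recovered)))
```

Because the zero point shifts the integer grid, a skewed range such as [-0.5, 1.5] uses all 256 INT8 levels, whereas a symmetric scheme would spend half of them on values that never occur. In this sketch the round-trip error for unclipped values stays within half a quantization step (`scale / 2`).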


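Source journal

Integration-The Vlsi Journal (Engineering & Technology - Engineering: Electrical & Electronic)

CiteScore: 3.80

Self-citation rate: 5.30%

Articles published: 107

Review time: 6 months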

About the journal: Integration's aim is to cover every aspect of the VLSI area, with an emphasis on cross-fertilization between various fields of science, and the design, verification, test and applications of integrated circuits and systems, as well as closely related topics in process and device technologies. Individual issues will feature peer-reviewed tutorials and articles as well as reviews of recent publications. The intended coverage of the journal can be assessed by examining the following (non-exclusive) list of topics: Specification methods and languages; Analog/Digital Integrated Circuits and Systems; VLSI architectures; Algorithms, methods and tools for modeling, simulation, synthesis and verification of integrated circuits and systems of any complexity; Embedded systems; High-level synthesis for VLSI systems; Logic synthesis and finite automata; Testing, design-for-test and test generation algorithms; Physical design; Formal verification; Algorithms implemented in VLSI systems; Systems engineering; Heterogeneous systems.