Mhd Rashed Al Koutayni, Gerd Reis, Didier Stricker
Optimization strategies for neural network deployment on FPGA: An energy-efficient real-time face detection use case
Internet of Things, vol. 33, Article 101676. Published 2025-07-10. DOI: 10.1016/j.iot.2025.101676
https://www.sciencedirect.com/science/article/pii/S2542660525001908
Abstract
Field programmable gate arrays (FPGAs) are considered promising platforms for accelerating deep neural networks (DNNs) due to their parallel processing capabilities and energy efficiency. However, deploying DNNs on FPGA platforms for computer vision tasks presents unique challenges, such as limited computational resources, constrained power budgets, and the need for real-time performance. This work presents a set of optimization methodologies to enhance the efficiency of real-time DNN inference on FPGA system-on-a-chip (SoC) platforms. These optimizations include architectural modifications, fixed-point quantization, computation reordering, and parallelization. Additionally, hardware/software partitioning is employed to optimize task allocation between the processing system (PS) and programmable logic (PL), along with system integration and interface configuration. To validate these strategies, we apply them to a baseline face detection DNN (FaceBoxes) as a use case. The proposed techniques not only improve the efficiency of FaceBoxes on FPGA but also provide a roadmap for optimizing other DNN-based applications for resource-constrained platforms. Experimental results on the AMD Xilinx ZCU102 board with VGA resolution (480 × 640 × 3) input demonstrate a significant increase in efficiency, achieving real-time performance while substantially reducing dynamic energy consumption.
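To make the fixed-point quantization step named in the abstract more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the 8-bit word length, the 4 fractional bits, and the helper names `quantize`, `dequantize`, and `fixed_dot` are all assumptions chosen for illustration. It shows how real-valued weights and activations can be mapped to saturating signed integers so that a dot product needs only integer multiply-accumulate operations, which is the kind of arithmetic that maps well to programmable logic.

```python
def quantize(x: float, total_bits: int = 8, frac_bits: int = 4) -> int:
    """Map a real value to a signed fixed-point integer, saturating on overflow.

    total_bits and frac_bits are illustrative assumptions, not values from the paper.
    Python's round() rounds halves to the nearest even integer.
    """
    scale = 1 << frac_bits                  # 2**frac_bits
    lo = -(1 << (total_bits - 1))           # most negative representable code
    hi = (1 << (total_bits - 1)) - 1        # most positive representable code
    q = round(x * scale)
    return max(lo, min(hi, q))


def dequantize(q: int, frac_bits: int = 4) -> float:
    """Convert a fixed-point integer code back to a real value."""
    return q / (1 << frac_bits)


def fixed_dot(weights, activations, total_bits: int = 8, frac_bits: int = 4) -> float:
    """Dot product computed with integer multiply-accumulate only.

    Each product carries 2 * frac_bits fractional bits, so the accumulator
    is rescaled once at the end rather than after every multiplication.
    """
    qw = [quantize(w, total_bits, frac_bits) for w in weights]
    qa = [quantize(a, total_bits, frac_bits) for a in activations]
    acc = sum(w * a for w, a in zip(qw, qa))  # wide integer accumulator
    return acc / (1 << (2 * frac_bits))
```

For example, `fixed_dot([0.5, 0.25], [1.0, 2.0])` evaluates the same dot product a floating-point implementation would, because every operand here is exactly representable with 4 fractional bits; values outside the representable range are clipped, and values between codes incur rounding error.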
About the journal:
Internet of Things; Engineering Cyber Physical Human Systems is a comprehensive journal encouraging cross collaboration between researchers, engineers and practitioners in the field of IoT & Cyber Physical Human Systems. The journal offers a unique platform to exchange scientific information on the entire breadth of technology, science, and societal applications of the IoT.
The journal will place a high priority on timely publication, and provide a home for high quality.
Furthermore, IOT is interested in publishing topical Special Issues on any aspect of IOT.