A Reconfigurable Coarse-to-Fine Approach for the Execution of CNN Inference Models in Low-Power Edge Devices

IF 0.8 4区计算机科学 Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IET Computers and Digital Techniques Pub Date : 2024-12-19 DOI:10.1049/cdt2/6214436

Auangkun Rangsikunpum, Sam Amiri, Luciano Ost

{"title":"A Reconfigurable Coarse-to-Fine Approach for the Execution of CNN Inference Models in Low-Power Edge Devices","authors":"Auangkun Rangsikunpum, Sam Amiri, Luciano Ost","doi":"10.1049/cdt2/6214436","DOIUrl":null,"url":null,"abstract":"<p>Convolutional neural networks (CNNs) have evolved into essential components for a wide range of embedded applications due to their outstanding efficiency and performance. To efficiently deploy CNN inference models on resource-constrained edge devices, field programmable gate arrays (FPGAs) have become a viable processing solution because of their unique hardware characteristics, enabling flexibility, parallel computation and low-power consumption. In this regard, this work proposes an FPGA-based dynamic reconfigurable coarse-to-fine (C2F) inference of CNN models, aiming to increase power efficiency and flexibility. The proposed C2F approach first coarsely classifies related input images into superclasses and then selects the appropriate fine model(s) to recognise and classify the input images according to their bespoke categories. Furthermore, the proposed architecture can be reprogrammed to the original model using partial reconfiguration (PR) in case the typical classification is required. To efficiently utilise different fine models on low-cost FPGAs with area minimisation, ZyCAP-based PR is adopted. Results show that our approach significantly improves the classification process when object identification of only one coarse category of interest is needed. This approach can reduce energy consumption and inference time by up to 27.2% and 13.2%, respectively, which can greatly benefit resource-constrained applications.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"2024 1","pages":""},"PeriodicalIF":0.8000,"publicationDate":"2024-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2/6214436","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IET Computers and Digital Techniques","FirstCategoryId":"94","ListUrlMain":"https://ietresearch.onlinelibrary.wiley.com/doi/10.1049/cdt2/6214436","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}

引用次数: 0

Abstract

Convolutional neural networks (CNNs) have evolved into essential components for a wide range of embedded applications due to their outstanding efficiency and performance. To efficiently deploy CNN inference models on resource-constrained edge devices, field programmable gate arrays (FPGAs) have become a viable processing solution because of their unique hardware characteristics, enabling flexibility, parallel computation and low-power consumption. In this regard, this work proposes an FPGA-based dynamic reconfigurable coarse-to-fine (C2F) inference of CNN models, aiming to increase power efficiency and flexibility. The proposed C2F approach first coarsely classifies related input images into superclasses and then selects the appropriate fine model(s) to recognise and classify the input images according to their bespoke categories. Furthermore, the proposed architecture can be reprogrammed to the original model using partial reconfiguration (PR) in case the typical classification is required. To efficiently utilise different fine models on low-cost FPGAs with area minimisation, ZyCAP-based PR is adopted. Results show that our approach significantly improves the classification process when object identification of only one coarse category of interest is needed. This approach can reduce energy consumption and inference time by up to 27.2% and 13.2%, respectively, which can greatly benefit resource-constrained applications.

Abstract Image

查看原文本刊更多论文

在低功耗边缘设备中执行 CNN 推断模型的可重构粗到细方法

卷积神经网络（CNN）因其出色的效率和性能，已发展成为各种嵌入式应用的重要组件。为了在资源受限的边缘设备上高效部署 CNN 推断模型，现场可编程门阵列（FPGA）因其独特的硬件特性而成为一种可行的处理解决方案，可实现灵活性、并行计算和低功耗。为此，本研究提出了一种基于 FPGA 的 CNN 模型动态可重构粗到细（C2F）推理方法，旨在提高能效和灵活性。所提出的 C2F 方法首先将相关输入图像粗分类为超类，然后选择适当的精细模型，根据定制类别对输入图像进行识别和分类。此外，在需要进行典型分类时，还可使用部分重新配置（PR）将拟议架构重新编程为原始模型。为了在低成本 FPGA 上有效利用不同的精细模型，同时最大限度地减少面积，我们采用了基于 ZyCAP 的 PR。结果表明，当只需要识别一个感兴趣的粗分类对象时，我们的方法能明显改善分类过程。这种方法可将能耗和推理时间分别减少 27.2% 和 13.2%，这对资源有限的应用大有裨益。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

IET Computers and Digital Techniques 工程技术-计算机：理论方法

CiteScore

3.50

自引率

0.00%

发文量

审稿时长

>12 weeks

期刊介绍： IET Computers & Digital Techniques publishes technical papers describing recent research and development work in all aspects of digital system-on-chip design and test of electronic and embedded systems, including the development of design automation tools (methodologies, algorithms and architectures). Papers based on the problems associated with the scaling down of CMOS technology are particularly welcome. It is aimed at researchers, engineers and educators in the fields of computer and digital systems design and test. The key subject areas of interest are: Design Methods and Tools: CAD/EDA tools, hardware description languages, high-level and architectural synthesis, hardware/software co-design, platform-based design, 3D stacking and circuit design, system on-chip architectures and IP cores, embedded systems, logic synthesis, low-power design and power optimisation. Simulation, Test and Validation: electrical and timing simulation, simulation based verification, hardware/software co-simulation and validation, mixed-domain technology modelling and simulation, post-silicon validation, power analysis and estimation, interconnect modelling and signal integrity analysis, hardware trust and security, design-for-testability, embedded core testing, system-on-chip testing, on-line testing, automatic test generation and delay testing, low-power testing, reliability, fault modelling and fault tolerance. Processor and System Architectures: many-core systems, general-purpose and application specific processors, computational arithmetic for DSP applications, arithmetic and logic units, cache memories, memory management, co-processors and accelerators, systems and networks on chip, embedded cores, platforms, multiprocessors, distributed systems, communication protocols and low-power issues. Configurable Computing: embedded cores, FPGAs, rapid prototyping, adaptive computing, evolvable and statically and dynamically reconfigurable and reprogrammable systems, reconfigurable hardware. Design for variability, power and aging: design methods for variability, power and aging aware design, memories, FPGAs, IP components, 3D stacking, energy harvesting. Case Studies: emerging applications, applications in industrial designs, and design frameworks.