Wei Zeng, Yuzhou Xiao, Yiru Wang, Caihua Chen, Sulan He
Microprocessors and Microsystems, vol. 118, Article 105202. Published 2025-09-22. DOI: 10.1016/j.micpro.2025.105202. URL: https://www.sciencedirect.com/science/article/pii/S0141933125000699
Design and implementation of a hardware accelerator IP core for improved lightweight deep learning model
Real-time, multi-point, full-scene monitoring with low cost, low power consumption, low communication overhead, and front-end deployment is a current research focus in fire detection technology. This paper investigates and implements deep-learning-based fire detection on the low-computation ZYNQ platform, aiming to provide a cost-effective, efficient, and reliable fire detection solution. First, we propose a lightweight network model, YOLO-Fire, which replaces standard convolutions with depthwise separable convolutions, adds the ECA attention mechanism, and introduces multi-scale feature fusion to fit the memory and computational limits of the ZYNQ device. Second, we design a hardware accelerator IP core for the ZYNQ7020 platform using a specific loop tiling strategy, constraint statements, and parallelization across both the input and output channels of the convolution. Combined with fixed-point quantization and resource optimization, this implementation efficiently accelerates the convolution, pooling, and upsampling layers. Experimental results show that YOLO-Fire improves accuracy, recall, and F1-score on the public BoWFire flame dataset and on a self-constructed flame dataset. In addition, the average inference time on the ZYNQ platform is approximately 74.43 times faster than on mainstream ARM AI platforms, verifying the effectiveness of the proposed acceleration approach.
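The abstract names two ideas that are easy to illustrate in a few lines: the arithmetic saving from depthwise separable convolutions, and loop tiling over the input and output channels of a convolution. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names, tile sizes (tm, tn), stride-1 unpadded convolution, and the layer shape in the usage note are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only. Names, tile sizes, and shapes are assumptions,
# not values taken from the paper.

def mac_counts(k, c_in, c_out, h_out, w_out):
    """Multiply-accumulate counts for one layer with a k x k kernel.
    Returns (standard convolution, depthwise separable convolution)."""
    standard = k * k * c_in * c_out * h_out * w_out
    depthwise = k * k * c_in * h_out * w_out   # one k x k filter per input channel
    pointwise = c_in * c_out * h_out * w_out   # 1 x 1 convolution mixing channels
    return standard, depthwise + pointwise


def _zeros(c_out, h_out, w_out):
    return [[[0] * w_out for _ in range(h_out)] for _ in range(c_out)]


def conv_direct(x, w):
    """Reference convolution, stride 1, no padding.
    x: [C_in][H][W], w: [C_out][C_in][K][K], nested lists of ints."""
    c_out, c_in, k = len(w), len(x), len(w[0][0])
    h_out, w_out = len(x[0]) - k + 1, len(x[0][0]) - k + 1
    y = _zeros(c_out, h_out, w_out)
    for m in range(c_out):
        for n in range(c_in):
            for r in range(h_out):
                for c in range(w_out):
                    for i in range(k):
                        for j in range(k):
                            y[m][r][c] += w[m][n][i][j] * x[n][r + i][c + j]
    return y


def conv_tiled(x, w, tm=2, tn=2):
    """Same result as conv_direct, with channels processed in tiles of
    tm output channels by tn input channels. In an HLS design, the two
    innermost channel loops are the ones that would typically be unrolled
    into parallel multiply-accumulate units (an assumption about the
    general technique, not a description of the paper's IP core)."""
    c_out, c_in, k = len(w), len(x), len(w[0][0])
    h_out, w_out = len(x[0]) - k + 1, len(x[0][0]) - k + 1
    y = _zeros(c_out, h_out, w_out)
    for m0 in range(0, c_out, tm):            # output-channel tiles
        for n0 in range(0, c_in, tn):         # input-channel tiles
            for r in range(h_out):
                for c in range(w_out):
                    for i in range(k):
                        for j in range(k):
                            # Candidate loops for hardware unrolling.
                            for m in range(m0, min(m0 + tm, c_out)):
                                for n in range(n0, min(n0 + tn, c_in)):
                                    y[m][r][c] += w[m][n][i][j] * x[n][r + i][c + j]
    return y
```

For a hypothetical 3 x 3 layer with 16 input channels, 32 output channels, and an 8 x 8 output map, mac_counts(3, 16, 32, 8, 8) gives 294912 multiply-accumulates for the standard convolution against 41984 for the depthwise separable form, roughly a sevenfold reduction. The tiled loop nest computes the same output as the direct one; only the iteration order changes, which is what lets a fixed number of parallel units be reused across all channel tiles.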
About the journal:
Microprocessors and Microsystems: Embedded Hardware Design (MICPRO) is a journal covering all design and architectural aspects of embedded systems hardware. This includes embedded hardware platforms ranging from custom hardware, through reconfigurable systems and application-specific processors, to general-purpose embedded processors. Special emphasis is put on novel complex embedded architectures, such as systems on chip (SoC), systems on a programmable/reconfigurable chip (SoPC), and multi-processor systems on a chip (MPSoC), as well as their memory and communication methods and structures, such as network-on-chip (NoC).
Design automation of such systems, including methodologies, techniques, flows, and tools for their design, as well as novel designs of hardware components, falls within the scope of this journal. Novel cyber-physical applications that use embedded systems are also central to this journal. While software is not the main focus, methods of hardware/software co-design, as well as application restructuring and mapping to embedded hardware platforms that consider the interplay between software and hardware components with an emphasis on hardware, are also within the journal's scope.