System Integration and Optimization of AI Hardware Acceleration Architecture for Object Detection

2023 International Conference on Consumer Electronics - Taiwan (ICCE-Taiwan). Pub Date: 2023-07-17. DOI: 10.1109/ICCE-Taiwan58799.2023.10226770

Chung-Bin Wu, Yi-Yen Lai, Yen-Ren Hou


Abstract

This paper proposes a system-integrated, optimized hardware acceleration design for the lightweight YOLOv3 object detection network. The design covers the Convolution Layer, the Maxpooling Layer, the Detection Layer, the Shortcut Layer, and the optimized i output layers. It is implemented and verified in hardware on the Xilinx Zynq UltraScale+ MPSoC ZCU102 FPGA platform at an operating frequency of 180 MHz. Fusing the Convolution and Maxpooling Layers reduces bandwidth usage by 85.33%, and fusing the Shortcut and Convolution Layers reduces it by 45.27%. The optimized Maxpooling Layer and Shortcut Layer run 15 and 26 times faster, respectively, than on an ARM Cortex-A53. The integrated system and its results are displayed on an HDMI monitor.
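To illustrate why fusing a convolution with the max-pooling layer that follows it can cut memory bandwidth, the sketch below counts feature-map bytes moved to and from off-chip memory with and without fusion. It is a rough model under stated assumptions, not the authors' accounting: it assumes a "same"-padded convolution, a pooling window whose stride equals its size, one byte per element, and it ignores weight traffic. The function names and the 416x416x3 to 16-channel layer shape are hypothetical and chosen only for illustration, so the result is not expected to match the 85.33% figure reported above.

```python
def conv_pool_traffic(h, w, c_in, c_out, pool=2, bytes_per_elem=1, fused=False):
    """Estimate off-chip feature-map bytes for conv followed by max pooling.

    Assumptions (not from the paper): 'same' convolution, pooling stride
    equal to the pooling window, weight traffic ignored.
    """
    conv_in = h * w * c_in * bytes_per_elem          # input feature map read
    conv_out = h * w * c_out * bytes_per_elem        # full-resolution conv output
    pooled = (h // pool) * (w // pool) * c_out * bytes_per_elem  # pooled output

    if fused:
        # Pooling happens on-chip, so only the pooled result is written back.
        return conv_in + pooled
    # Unfused: conv output is written out, read back by the pooling layer,
    # and the pooled result is written out again.
    return conv_in + conv_out + conv_out + pooled


def fusion_reduction(h, w, c_in, c_out, **kwargs):
    """Fraction of feature-map traffic saved by fusing conv and pooling."""
    unfused = conv_pool_traffic(h, w, c_in, c_out, fused=False, **kwargs)
    fused = conv_pool_traffic(h, w, c_in, c_out, fused=True, **kwargs)
    return 1.0 - fused / unfused


if __name__ == "__main__":
    # Hypothetical first-layer shape: 416x416 RGB input, 16 output channels.
    saved = fusion_reduction(416, 416, 3, 16)
    print(f"estimated traffic reduction: {saved:.2%}")
```

The saving comes entirely from the intermediate full-resolution feature map never leaving the chip, which is why the estimate grows with the ratio of output channels to input channels.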
