Modeling and optimizing PE utilization rate for systolic array based CNN accelerators

Minhui Hu, Jianhua Fan, Yongyang Hu, Rui Xu, Yang Guo
{"title":"Modeling and optimizing PE utilization rate for systolic array based CNN accelerators","authors":"Minhui Hu, Jianhua Fan, Yongyang Hu, Rui Xu, Yang Guo","doi":"10.1117/12.2682498","DOIUrl":null,"url":null,"abstract":"Due to its efficiency, energy-saving, and abundant data reuse, systolic array has been a popular choice for Convolutional Neural Network (CNN) accelerators. Dataflow of the systolic array defines computation mapping strategy and memory access and it is one of the most important design points of accelerators. Most conventional accelerator designs choose a single dataflow and optimize around it. This may influence the Processing Element (PE) utilization rate and cause waste of computing resources and energy. This work introduces a self-paced method to alleviate this problem. We analyse and quantify the PE utilization rate related to the three basic dataflows and build a model called PEU-sim to explore workload-oriented flexible dataflow. Experiments show by combining three dataflows, we are able to raise more than 10% of PE utilization rate for most neural networks and we get the highest of 12.4% for MobileNet.","PeriodicalId":440430,"journal":{"name":"International Conference on Electronic Technology and Information Science","volume":"12715 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Electronic Technology and Information Science","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1117/12.2682498","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Due to its efficiency, energy savings, and abundant data reuse, the systolic array has been a popular choice for Convolutional Neural Network (CNN) accelerators. The dataflow of a systolic array defines the computation mapping strategy and memory access pattern, and it is one of the most important design points of an accelerator. Most conventional accelerator designs choose a single dataflow and optimize around it. This can limit the Processing Element (PE) utilization rate and waste computing resources and energy. This work introduces a self-paced method to alleviate this problem. We analyse and quantify the PE utilization rate associated with the three basic dataflows and build a model called PEU-sim to explore workload-oriented flexible dataflows. Experiments show that by combining the three dataflows, we can raise the PE utilization rate by more than 10% for most neural networks, with the largest improvement of 12.4% for MobileNet.
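The paper itself provides no code, but the underlying idea of quantifying spatial PE utilization for a tiled layer mapping can be illustrated with a minimal sketch. The function name `spatial_pe_utilization`, the 64x64 array size, and the layer dimensions below are hypothetical and are not taken from PEU-sim; the sketch only shows why choosing a dataflow per layer can raise average utilization.

```python
import math

def spatial_pe_utilization(mapped_rows: int, mapped_cols: int,
                           array_rows: int, array_cols: int) -> float:
    """Average fraction of PEs doing useful work when a workload of
    mapped_rows x mapped_cols MAC lanes is tiled onto an
    array_rows x array_cols systolic array."""
    row_passes = math.ceil(mapped_rows / array_rows)   # tiles along the row dimension
    col_passes = math.ceil(mapped_cols / array_cols)   # tiles along the column dimension
    total_pe_slots = row_passes * col_passes * array_rows * array_cols
    return (mapped_rows * mapped_cols) / total_pe_slots

# Hypothetical layer: 32 input channels, 96 filters, 112x112 output map,
# mapped onto a 64x64 array.
# Dataflow A: input channels -> rows, filters -> columns (weight-stationary style).
util_a = spatial_pe_utilization(32, 96, 64, 64)
# Dataflow B: output pixels -> rows, filters -> columns (output-stationary style).
util_b = spatial_pe_utilization(112 * 112, 96, 64, 64)

# Selecting the better-fitting dataflow per layer is what a flexible,
# workload-oriented scheme exploits.
print(f"dataflow A: {util_a:.2%}, dataflow B: {util_b:.2%}, best: {max(util_a, util_b):.2%}")
```

In this toy example dataflow A leaves half of each column pass idle because only 32 of the 64 rows are occupied, whereas dataflow B keeps most rows busy, which mirrors the kind of per-layer dataflow selection the abstract describes.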