Modeling and optimizing PE utilization rate for systolic array based CNN accelerators

Minhui Hu, Jianhua Fan, Yongyang Hu, Rui Xu, Yang Guo
{"title":"Modeling and optimizing PE utilization rate for systolic array based CNN accelerators","authors":"Minhui Hu, Jianhua Fan, Yongyang Hu, Rui Xu, Yang Guo","doi":"10.1117/12.2682498","DOIUrl":null,"url":null,"abstract":"Due to its efficiency, energy-saving, and abundant data reuse, systolic array has been a popular choice for Convolutional Neural Network (CNN) accelerators. Dataflow of the systolic array defines computation mapping strategy and memory access and it is one of the most important design points of accelerators. Most conventional accelerator designs choose a single dataflow and optimize around it. This may influence the Processing Element (PE) utilization rate and cause waste of computing resources and energy. This work introduces a self-paced method to alleviate this problem. We analyse and quantify the PE utilization rate related to the three basic dataflows and build a model called PEU-sim to explore workload-oriented flexible dataflow. Experiments show by combining three dataflows, we are able to raise more than 10% of PE utilization rate for most neural networks and we get the highest of 12.4% for MobileNet.","PeriodicalId":440430,"journal":{"name":"International Conference on Electronic Technology and Information Science","volume":"12715 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Electronic Technology and Information Science","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1117/12.2682498","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Due to its efficiency, energy savings, and abundant data reuse, the systolic array has been a popular choice for Convolutional Neural Network (CNN) accelerators. The dataflow of a systolic array defines the computation mapping strategy and memory access pattern, and it is one of the most important design points of an accelerator. Most conventional accelerator designs choose a single dataflow and optimize around it. This can limit the Processing Element (PE) utilization rate and waste computing resources and energy. This work introduces a self-paced method to alleviate this problem. We analyse and quantify the PE utilization rate associated with the three basic dataflows and build a model called PEU-sim to explore workload-oriented flexible dataflows. Experiments show that by combining the three dataflows, we can raise the PE utilization rate by more than 10% for most neural networks, with the largest improvement of 12.4% for MobileNet.
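The paper itself provides no code, but the underlying idea of quantifying spatial PE utilization for a tiled layer mapping can be illustrated with a minimal sketch. The function name `spatial_pe_utilization`, the 64x64 array size, and the layer dimensions below are hypothetical and are not taken from PEU-sim; the sketch only shows why choosing a dataflow per layer can raise average utilization.

```python
import math

def spatial_pe_utilization(mapped_rows: int, mapped_cols: int,
                           array_rows: int, array_cols: int) -> float:
    """Average fraction of PEs doing useful work when a workload of
    mapped_rows x mapped_cols MAC lanes is tiled onto an
    array_rows x array_cols systolic array."""
    row_passes = math.ceil(mapped_rows / array_rows)   # tiles along the row dimension
    col_passes = math.ceil(mapped_cols / array_cols)   # tiles along the column dimension
    total_pe_slots = row_passes * col_passes * array_rows * array_cols
    return (mapped_rows * mapped_cols) / total_pe_slots

# Hypothetical layer: 32 input channels, 96 filters, 112x112 output map,
# mapped onto a 64x64 array.
# Dataflow A: input channels -> rows, filters -> columns (weight-stationary style).
util_a = spatial_pe_utilization(32, 96, 64, 64)
# Dataflow B: output pixels -> rows, filters -> columns (output-stationary style).
util_b = spatial_pe_utilization(112 * 112, 96, 64, 64)

# Selecting the better-fitting dataflow per layer is what a flexible,
# workload-oriented scheme exploits.
print(f"dataflow A: {util_a:.2%}, dataflow B: {util_b:.2%}, best: {max(util_a, util_b):.2%}")
```

In this toy example dataflow A leaves half of each column pass idle because only 32 of the 64 rows are occupied, whereas dataflow B keeps most rows busy, which mirrors the kind of per-layer dataflow selection the abstract describes.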