Improving Loop Parallelization by a Combination of Static and Dynamic Analyses in HLS

Florian Dewald, Johanna Rohde, C. Hochberger, H. Mantel
{"title":"Improving Loop Parallelization by a Combination of Static and Dynamic Analyses in HLS","authors":"Florian Dewald, Johanna Rohde, C. Hochberger, H. Mantel","doi":"10.1145/3501801","DOIUrl":null,"url":null,"abstract":"High-level synthesis (HLS) can be used to create hardware accelerators for compute-intense software parts such as loop structures. Usually, this process requires significant amount of user interaction to steer kernel selection and optimizations. This can be tedious and time-consuming. In this article, we present an approach that fully autonomously finds independent loop iterations and reductions to create parallelized accelerators. We combine static analysis with information available only at runtime to maximize the parallelism exploited by the created accelerators. For loops where we see potential for parallelism, we create fully parallelized kernel implementations. If static information does not suffice to deduce independence, then we assume independence at compile time. We verify this assumption by statically created checks that are dynamically evaluated at runtime, before using the optimized kernel. Evaluating our approach, we can generate speedups for five out of seven benchmarks. With four loop iterations running in parallel, we achieve ideal speedups of up to 4× and on average speedups of 2.27×, both in comparison to an unoptimized accelerator.","PeriodicalId":162787,"journal":{"name":"ACM Transactions on Reconfigurable Technology and Systems (TRETS)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Reconfigurable Technology and Systems (TRETS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3501801","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

High-level synthesis (HLS) can be used to create hardware accelerators for compute-intense software parts such as loop structures. Usually, this process requires significant amount of user interaction to steer kernel selection and optimizations. This can be tedious and time-consuming. In this article, we present an approach that fully autonomously finds independent loop iterations and reductions to create parallelized accelerators. We combine static analysis with information available only at runtime to maximize the parallelism exploited by the created accelerators. For loops where we see potential for parallelism, we create fully parallelized kernel implementations. If static information does not suffice to deduce independence, then we assume independence at compile time. We verify this assumption by statically created checks that are dynamically evaluated at runtime, before using the optimized kernel. Evaluating our approach, we can generate speedups for five out of seven benchmarks. With four loop iterations running in parallel, we achieve ideal speedups of up to 4× and on average speedups of 2.27×, both in comparison to an unoptimized accelerator.
利用静态和动态分析相结合的方法提高HLS的循环并行性
高级综合(HLS)可用于为环路结构等计算密集型软件部件创建硬件加速器。通常,这个过程需要大量的用户交互来引导内核选择和优化。这可能是乏味和耗时的。在本文中,我们提出了一种完全自主地发现独立循环迭代和缩减以创建并行加速器的方法。我们将静态分析与仅在运行时可用的信息结合起来,以最大化所创建的加速器所利用的并行性。在For循环中,我们看到了并行的潜力,我们创建了完全并行化的内核实现。如果静态信息不足以推断独立性,那么我们在编译时假定独立性。在使用优化的内核之前,我们通过静态创建的检查来验证这个假设,这些检查在运行时动态评估。评估我们的方法,我们可以为七个基准中的五个生成加速。与未优化的加速器相比,通过并行运行四个循环迭代,我们实现了高达4倍的理想加速,平均加速为2.27倍。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信