On Data Parallelism Code Restructuring for HLS Targeting FPGAs

2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) Pub Date : 2021-06-01 DOI:10.1109/IPDPSW52791.2021.00029

Renato Campos, João MP Cardoso

{"title":"On Data Parallelism Code Restructuring for HLS Targeting FPGAs","authors":"Renato Campos, João MP Cardoso","doi":"10.1109/IPDPSW52791.2021.00029","DOIUrl":null,"url":null,"abstract":"FPGAs have emerged as hardware accelerators, and in the last decade, researchers have proposed new languages and frameworks to improve the efficiency when mapping computations to FPGAs. One of the main tasks when considering the mapping of software code to FPGAs is code restructuring. Code restructuring is of paramount importance to achieve efficient FPGA-based accelerators, and its automation continues to be a challenge. This paper describes our recent work on techniques to automatically restructure and annotate C code with directives optimized for HLS targeting FPGAs. The input of our approach consists of an unfolded dataflow graph (DFG), currently obtained by a trace of the program’s execution, and restructured C code with HLS directives as output. Specifically, in this paper we propose algorithms to optimize the input DFGs and use isomorphic graph detection for exposing data-level parallelism. The experimental results show that our approach is able to generate efficient FPGA implementations, with significant speedups over the input unmodified source codes, and very competitive to implementations obtained by manual optimizations and by previous approaches. Furthermore, the experiments show that, using our approach, it is possible to extract data-parallelism in linear to quadratic time with respect to the number of nodes of the input DFG.","PeriodicalId":170832,"journal":{"name":"2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","volume":"130 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPDPSW52791.2021.00029","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 2

Abstract

FPGAs have emerged as hardware accelerators, and in the last decade, researchers have proposed new languages and frameworks to improve the efficiency when mapping computations to FPGAs. One of the main tasks when considering the mapping of software code to FPGAs is code restructuring. Code restructuring is of paramount importance to achieve efficient FPGA-based accelerators, and its automation continues to be a challenge. This paper describes our recent work on techniques to automatically restructure and annotate C code with directives optimized for HLS targeting FPGAs. The input of our approach consists of an unfolded dataflow graph (DFG), currently obtained by a trace of the program’s execution, and restructured C code with HLS directives as output. Specifically, in this paper we propose algorithms to optimize the input DFGs and use isomorphic graph detection for exposing data-level parallelism. The experimental results show that our approach is able to generate efficient FPGA implementations, with significant speedups over the input unmodified source codes, and very competitive to implementations obtained by manual optimizations and by previous approaches. Furthermore, the experiments show that, using our approach, it is possible to extract data-parallelism in linear to quadratic time with respect to the number of nodes of the input DFG.

查看原文本刊更多论文

HLS目标fpga的数据并行代码重构研究

fpga已经成为硬件加速器，在过去的十年中，研究人员提出了新的语言和框架来提高将计算映射到fpga时的效率。当考虑将软件代码映射到fpga时，主要任务之一是代码重构。代码重构对于实现高效的基于fpga的加速器至关重要，其自动化仍然是一个挑战。本文描述了我们最近在使用针对针对fpga的HLS优化的指令自动重构和注释C代码的技术方面的工作。我们的方法的输入包括一个未展开的数据流图(DFG)，当前通过对程序执行的跟踪获得，以及用HLS指令作为输出的重构的C代码。具体来说，在本文中，我们提出了优化输入DFGs的算法，并使用同构图检测来暴露数据级并行性。实验结果表明，我们的方法能够生成高效的FPGA实现，与输入未修改的源代码相比具有显着的加速，并且与手动优化和以前的方法相比具有很强的竞争力。此外，实验表明，使用我们的方法，可以在线性到二次时间内提取相对于输入DFG的节点数量的数据并行性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)

自引率

0.00%

发文量