Application Acceleration on FPGAs with OmpSs@FPGA

2018 International Conference on Field-Programmable Technology (FPT). Pub Date: 2018-12-01. DOI: 10.1109/FPT.2018.00021

Jaume Bosch, Xubin Tan, Antonio Filgueras, Miquel Vidal Piñol, Marc Mateu, Daniel Jiménez-González, C. Álvarez, X. Martorell, E. Ayguadé, Jesús Labarta


Citations: 21

Abstract

OmpSs@FPGA is the flavor of OmpSs that allows offloading application functionality to FPGAs. Like OpenMP, it is based on compiler directives. While the OpenMP specification also includes support for heterogeneous execution, we use OmpSs and OmpSs@FPGA as a prototype implementation to develop new ideas for OpenMP. OmpSs@FPGA implements the tasking model with runtime support to automatically exploit all SMP and FPGA resources available in the execution platform. In this paper, we present the OmpSs@FPGA ecosystem, based on the Mercurium compiler and the Nanos++ runtime system. We show how applications are transformed to run on the SMP cores and the FPGA. The application kernels defined as tasks to be accelerated using the OmpSs directives are: 1) transformed by the compiler into kernels connected with the proper synchronization and communication ports, 2) extracted to intermediate files, 3) compiled through the FPGA vendor HLS tool, and 4) used to configure the FPGA. Our Nanos++ runtime system schedules the application tasks on the platform and can use the SMP cores and the FPGA accelerators at the same time. We evaluate the OmpSs@FPGA environment with the Matrix Multiplication, Cholesky and N-Body benchmarks, showing the internal details of the execution and the performance obtained on a Zynq UltraScale+ MPSoC (up to 128x). The source code uses OmpSs@FPGA annotations, and different Vivado HLS optimization directives are applied for acceleration.
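To make the directive-based model more concrete, the following is a minimal sketch of how a tiled matrix multiplication might be annotated. It is not taken from the paper. The tile size `BS`, the tile count `NB`, the function names `matmul_block` and `matmul_tiled`, and the exact clause spelling of the `target` and `task` directives are assumptions for illustration; the clauses follow OmpSs conventions as commonly documented and may differ between toolchain versions. The `#pragma HLS pipeline` line stands in for the Vivado HLS optimization directives mentioned in the abstract. A plain C compiler ignores all of these pragmas and runs the code sequentially on the host.

```c
/* Illustrative sketch only: names, sizes, and clause spelling are assumptions. */

#define BS 4 /* tile edge length (hypothetical value) */
#define NB 2 /* tiles per matrix edge (hypothetical value) */

/* Kernel declared as a task that may be offloaded to the FPGA.
 * in/inout describe the data each task reads and updates, which is
 * what a task-based runtime uses to order tasks and move data. */
#pragma omp target device(fpga) copy_deps
#pragma omp task in([BS*BS]a, [BS*BS]b) inout([BS*BS]c)
void matmul_block(const float *a, const float *b, float *c)
{
    for (int i = 0; i < BS; i++) {
        for (int j = 0; j < BS; j++) {
#pragma HLS pipeline II=1
            float acc = c[i * BS + j];
            for (int k = 0; k < BS; k++)
                acc += a[i * BS + k] * b[k * BS + j];
            c[i * BS + j] = acc;
        }
    }
}

/* Host code: every call to matmul_block would become one task instance.
 * Matrices are stored tile by tile, each tile contiguous in memory. */
void matmul_tiled(float a[NB][NB][BS * BS],
                  float b[NB][NB][BS * BS],
                  float c[NB][NB][BS * BS])
{
    for (int i = 0; i < NB; i++)
        for (int j = 0; j < NB; j++)
            for (int k = 0; k < NB; k++)
                matmul_block(a[i][k], b[k][j], c[i][j]);

    /* Wait for all outstanding tasks before the results are read. */
#pragma omp taskwait
}
```

In this sketch the host loop only expresses which tiles each call touches. Under a task-based runtime of the kind the abstract describes, deciding whether a given task instance runs on an SMP core or on an FPGA accelerator would be left to the scheduler rather than written into the application.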

