Bridging the Performance-Programmability Gap for FPGAs via OpenCL: A Case Study with OpenDwarfs

2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM). Pub Date: 2016-05-01. DOI: 10.1109/FCCM.2016.56

K. Krommydas, A. Helal, Anshuman Verma, Wu-chun Feng


Citations: 17

Abstract

For decades, the streaming architecture of FPGAs has delivered accelerated performance across many application domains, such as option pricing solvers in finance, computational fluid dynamics in oil and gas, and packet processing in network routers and firewalls. However, this performance has come at the significant expense of programmability, a trade-off known as the performance-programmability gap. In particular, FPGA developers use a hardware description language (HDL) to implement the application data path and to design hardware modules for computation pipelines, memory management, synchronization, and communication. This process requires extensive low-level knowledge of the target FPGA architecture and consumes significant development time and effort. To address this lack of programmability of FPGAs, OpenCL provides an easy-to-use and portable programming model for CPUs, GPUs, APUs, and now, FPGAs. However, this significantly improved programmability can come at the expense of performance; that is, a performance-programmability gap still remains. To improve the performance of OpenCL kernels on FPGAs, and thus bridge the performance-programmability gap, we apply and evaluate the effect of various optimization techniques on GEM, an N-body method from the OpenDwarfs benchmark suite.
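
As a rough illustration of the all-pairs structure that characterizes N-body methods, the following Python sketch sums a Coulomb-style contribution q / r from every atom at each evaluation point. It is not the GEM kernel or the authors' OpenCL code: the function names, the simplified q / r formula, and the sample data are all assumptions made for illustration. In an OpenCL port, each evaluation point would typically map to one work-item, with the inner loop over atoms forming the computation pipeline.

```python
import math


def potential_at(point, atoms):
    """Per-point body (hypothetical): sum q / r over all atoms."""
    px, py, pz = point
    total = 0.0
    for ax, ay, az, q in atoms:
        # Euclidean distance between the evaluation point and the atom.
        r = math.sqrt((px - ax) ** 2 + (py - ay) ** 2 + (pz - az) ** 2)
        total += q / r
    return total


def all_potentials(points, atoms):
    """All-pairs N-body loop: every point interacts with every atom.

    Each outer iteration is independent of the others, which is why
    it could be mapped to a separate OpenCL work-item.
    """
    return [potential_at(p, atoms) for p in points]


# Illustrative data (assumed): two unit charges of opposite sign.
atoms = [(0.0, 0.0, 0.0, 1.0), (1.0, 0.0, 0.0, -1.0)]
points = [(0.0, 2.0, 0.0), (0.5, 1.0, 0.0)]
result = all_potentials(points, atoms)
print(result)
```

The second point is equidistant from both charges, so its contributions cancel. The work grows as the number of points times the number of atoms, which is the cost that kernel-level optimizations aim to hide.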

