Performance Evaluation of Fast Fourier Transform Application on Heterogeneous Platforms
Xiaojun Li, Yang Gao, Y. Liu
2011 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, 2011-10-10
DOI: https://doi.org/10.1109/CyberC.2011.48
Citations: 2
Abstract
Heterogeneous platforms that integrate SMPs, clusters, GPUs, FPGAs, and other devices are becoming the most popular supercomputer architectures. Achieving high performance on CPUs or GPUs requires careful attention to their different architectures, which challenges programmers' skills. To address the portability problem, the Khronos Compute Working Group proposed OpenCL, a free cross-platform programming standard. However, the performance of OpenCL-based programs has not yet been studied thoroughly. In this paper, we first design OpenFFT-Bench, a benchmark FFT application that combines an OpenCL-based FFT with OpenGL-based real-time spectrum visualization. We evaluate its performance on four OpenCL programming platforms: NVIDIA CUDA, ATI Stream (GPU), ATI Stream (CPU), and Intel OpenCL. We investigate the characteristics of OpenFFT-Bench across multiple FFT sizes. Experimental results show that OpenCL- and OpenGL-based applications not only run on multiple heterogeneous platforms but also achieve relatively high performance on GPU-based platforms.
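As a rough illustration of what measuring an FFT across multiple sizes involves, the sketch below times a plain radix-2 FFT in pure Python. It is not OpenFFT-Bench and uses neither OpenCL nor OpenGL; the names `fft_radix2` and `benchmark`, the test signal, and the 5·N·log2(N) nominal operation count (a common convention for reporting FFT throughput) are all assumptions introduced here for illustration.

```python
import cmath
import math
import time


def fft_radix2(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a nonzero power of two")
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])  # FFT of even-indexed samples
    odd = fft_radix2(x[1::2])   # FFT of odd-indexed samples
    half = n // 2
    out = [0j] * n
    for k in range(half):
        # Twiddle factor applied to the odd half, then butterfly combine.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out


def benchmark(sizes, repeats=3):
    """Return {size: (best_seconds, nominal_gflops)} for each FFT size.

    Hypothetical helper: the signal and the flop model are assumptions,
    not taken from the paper.
    """
    results = {}
    for n in sizes:
        # Real sine wave with 5 cycles over the window.
        signal = [complex(math.sin(2 * math.pi * 5 * i / n), 0.0) for i in range(n)]
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            fft_radix2(signal)
            best = min(best, time.perf_counter() - start)
        # Nominal operation count for a complex FFT of length n.
        flops = 5 * n * math.log2(n)
        gflops = flops / best / 1e9 if best > 0 else float("inf")
        results[n] = (best, gflops)
    return results


if __name__ == "__main__":
    for size, (seconds, gflops) in benchmark([256, 1024, 4096]).items():
        print(f"N={size:5d}  best={seconds:.6f} s  nominal={gflops:.6f} GFLOPS")
```

Taking the best of several repeats reduces timing noise from the operating system. A platform comparison like the one described in the abstract would run the same sizes on each device and compare the resulting figures.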