Three Dimensional Pseudo-Spectral Compressible Magnetohydrodynamic GPU Code for Astrophysical Plasma Simulation

2018 IEEE 25th International Conference on High Performance Computing Workshops (HiPCW) Pub Date: 2018-10-30 DOI: 10.1109/HIPCW.2018.8634104

R. Mukherjee, R. Ganesh, V. Saini, U. Maurya, N. Vydyanathan, Bharatkumar Sharma


Citations: 3

Abstract

This paper presents benchmarking and scaling studies of a GPU-accelerated, three-dimensional compressible magnetohydrodynamic code. The code was developed to explain the generation of large- and intermediate-scale magnetic fields in the cosmos as well as in nuclear fusion reactors, in light of the theory given by Eugene Newman Parker. Spatial derivatives are computed with a pseudo-spectral method, and the time solvers are explicit. GPU acceleration is achieved with minimal code changes through OpenACC parallelization and the NVIDIA CUDA Fast Fourier Transform library (cuFFT). NVIDIA's unified memory is leveraged to enable oversubscription of GPU device memory for seamless out-of-core processing of large grids. Our experimental results indicate that the GPU-accelerated code achieves up to two orders of magnitude speedup over a corresponding OpenMP-parallel, FFTW-based code on an NVIDIA Tesla P100 GPU. For large grids that require out-of-core processing on the GPU, we see a 7x speedup over the OpenMP, FFTW-based code on the Tesla P100 GPU. We also present a performance analysis of the GPU-accelerated code on different GPU architectures: Kepler, Pascal, and Volta.
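As a rough illustration of the pseudo-spectral idea mentioned in the abstract, the sketch below differentiates a periodic function in one dimension by transforming to Fourier space, multiplying each mode by i·k, and transforming back. It is not the authors' implementation: the function names `dft` and `spectral_derivative` are hypothetical, and a naive O(N²) transform from the Python standard library stands in for the FFTW and cuFFT libraries named in the abstract.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform (illustrative stand-in for an FFT library)."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        total = 0j
        for j, v in enumerate(values):
            total += v * cmath.exp(sign * 2j * math.pi * j * k / n)
        # Apply the 1/N normalisation on the inverse transform only.
        out.append(total / n if inverse else total)
    return out


def spectral_derivative(samples, length=2.0 * math.pi):
    """Differentiate periodic samples by multiplying each Fourier mode by i*k."""
    n = len(samples)
    coeffs = dft(samples)
    for m in range(n):
        # Map the array index to a signed integer mode number.
        mode = m if m <= (n - 1) // 2 else m - n
        if n % 2 == 0 and m == n // 2:
            mode = 0  # drop the Nyquist mode for an odd-order derivative
        coeffs[m] *= 1j * (2.0 * math.pi / length) * mode
    return [c.real for c in dft(coeffs, inverse=True)]


# Check against a known case: d/dx sin(x) = cos(x) on a periodic grid.
n = 16
x = [2.0 * math.pi * j / n for j in range(n)]
f = [math.sin(xj) for xj in x]
df = spectral_derivative(f)
max_error = max(abs(d - math.cos(xj)) for d, xj in zip(df, x))
print(f"max error vs cos(x): {max_error:.2e}")
```

In a production code of the kind described here, the transforms would be three-dimensional and handled by an FFT library, with nonlinear terms typically evaluated in real space between transforms; the sketch only shows the derivative step.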
