Three Dimensional Pseudo-Spectral Compressible Magnetohydrodynamic GPU Code for Astrophysical Plasma Simulation

R. Mukherjee, R. Ganesh, V. Saini, U. Maurya, N. Vydyanathan, Bharatkumar Sharma
Published in: 2018 IEEE 25th International Conference on High Performance Computing Workshops (HiPCW), 2018-10-30
DOI: 10.1109/HIPCW.2018.8634104 (https://doi.org/10.1109/HIPCW.2018.8634104)
Citations: 3

Abstract

This paper presents benchmarking and scaling studies of a GPU-accelerated, three-dimensional compressible magnetohydrodynamic code. The code was developed with the aim of explaining the generation of large- and intermediate-scale magnetic fields in the cosmos as well as in nuclear fusion reactors, in light of the theory given by Eugene Newman Parker. Spatial derivatives are computed with a pseudo-spectral method, and the time solvers are explicit. GPU acceleration is achieved with minimal code changes through OpenACC parallelization and use of the NVIDIA CUDA Fast Fourier Transform library (cuFFT). NVIDIA's unified memory is leveraged to enable oversubscription of the GPU device memory for seamless out-of-core processing of large grids. Our experimental results indicate that, on an NVIDIA Tesla P100 GPU, the GPU-accelerated code achieves up to two orders of magnitude speedup over a corresponding OpenMP-parallel code based on the FFTW library. For large grids that require out-of-core processing on the GPU, we see a 7x speedup over the OpenMP, FFTW-based code on the Tesla P100 GPU. We also present a performance analysis of the GPU-accelerated code on different GPU architectures: Kepler, Pascal, and Volta.
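To illustrate what a pseudo-spectral spatial derivative involves, here is a minimal NumPy sketch for a periodic three-dimensional grid: transform along one axis, multiply by i·k, and transform back. It is not the authors' code, which relies on OpenACC and cuFFT. The function name `spectral_derivative`, the 32³ grid, the 2π box length, and the test field are all assumptions chosen for illustration.

```python
import numpy as np

def spectral_derivative(f, length, axis):
    """Derivative of a periodic field along one axis, computed in Fourier space.

    Hypothetical helper for illustration; assumes f is periodic over `length`.
    """
    n = f.shape[axis]
    # Angular wavenumbers for a periodic box of the given length.
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)
    # Reshape k so it broadcasts along the chosen axis only.
    shape = [1] * f.ndim
    shape[axis] = n
    k = k.reshape(shape)
    # Differentiation becomes multiplication by i*k in Fourier space.
    f_hat = np.fft.fft(f, axis=axis)
    return np.real(np.fft.ifft(1j * k * f_hat, axis=axis))

# Assumed test setup: a 32^3 periodic grid on [0, 2*pi)^3.
n = 32
length = 2.0 * np.pi
x = np.arange(n) * (length / n)
X, Y, Z = np.meshgrid(x, x, x, indexing="ij")

# Smooth periodic test field with known analytic derivatives.
f = np.sin(X) * np.cos(2.0 * Y)

dfdx = spectral_derivative(f, length, axis=0)
dfdy = spectral_derivative(f, length, axis=1)

# Compare against the analytic derivatives.
max_error_x = np.max(np.abs(dfdx - np.cos(X) * np.cos(2.0 * Y)))
max_error_y = np.max(np.abs(dfdy + 2.0 * np.sin(X) * np.sin(2.0 * Y)))
```

For a smooth, well-resolved periodic field such as this one, the error against the analytic derivative should be near machine precision, which is the usual motivation for spectral methods. In a GPU code of the kind the abstract describes, the forward and inverse transforms would be the calls handed to an FFT library such as cuFFT.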