A Comprehensive Comparison and Analysis of OpenACC and OpenMP 4.5 for NVIDIA GPUs

R. Usha, P. Pandey, N. Mangala
{"title":"A Comprehensive Comparison and Analysis of OpenACC and OpenMP 4.5 for NVIDIA GPUs","authors":"R. Usha, P. Pandey, N. Mangala","doi":"10.1109/HPEC43674.2020.9286203","DOIUrl":null,"url":null,"abstract":"HPC systems having accelerator attached to it is the new normal. However, programming these accelerators to get good performance is very complex and tedious. Hence, directive based programming such as OpenMP and OpenACC are gaining wide popularity for parallel programming. They simplify the programming experience by abstracting the low-level complexities from the user. In this paper, we have done an extensive comparison of OpenMP 4.5 and OpenACC for GPU programming. Performance comparison of these two APIs on NVIDIA Tesla GPUs namely, P100 and V100 has also been captured. Data Transfer times, Kernel Execution times, Total Execution times and Performance portability are the criteria for comparison. The challenges faced while parallelizing the applications using the directives thus leading to improper outputs has also been dotted.","PeriodicalId":168544,"journal":{"name":"2020 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE High Performance Extreme Computing Conference (HPEC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPEC43674.2020.9286203","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

HPC systems having accelerator attached to it is the new normal. However, programming these accelerators to get good performance is very complex and tedious. Hence, directive based programming such as OpenMP and OpenACC are gaining wide popularity for parallel programming. They simplify the programming experience by abstracting the low-level complexities from the user. In this paper, we have done an extensive comparison of OpenMP 4.5 and OpenACC for GPU programming. Performance comparison of these two APIs on NVIDIA Tesla GPUs namely, P100 and V100 has also been captured. Data Transfer times, Kernel Execution times, Total Execution times and Performance portability are the criteria for comparison. The challenges faced while parallelizing the applications using the directives thus leading to improper outputs has also been dotted.
NVIDIA gpu的OpenACC与openmp4.5的综合比较与分析
带有加速器的高性能计算系统是新常态。然而,编程这些加速器以获得良好的性能是非常复杂和繁琐的。因此,基于指令的编程,如OpenMP和OpenACC,在并行编程中越来越受欢迎。它们从用户那里抽象出低级复杂性,从而简化了编程体验。在本文中,我们对openmp4.5和OpenACC在GPU编程方面进行了广泛的比较。这两个api在NVIDIA Tesla gpu,即P100和V100上的性能比较也被捕获。数据传输时间、内核执行时间、总执行时间和性能可移植性是比较的标准。在使用指令并行化应用程序从而导致不正确输出时所面临的挑战也被列举出来。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信