NVIDIA gpu的OpenACC与openmp4.5的综合比较与分析

R. Usha, P. Pandey, N. Mangala
{"title":"NVIDIA gpu的OpenACC与openmp4.5的综合比较与分析","authors":"R. Usha, P. Pandey, N. Mangala","doi":"10.1109/HPEC43674.2020.9286203","DOIUrl":null,"url":null,"abstract":"HPC systems having accelerator attached to it is the new normal. However, programming these accelerators to get good performance is very complex and tedious. Hence, directive based programming such as OpenMP and OpenACC are gaining wide popularity for parallel programming. They simplify the programming experience by abstracting the low-level complexities from the user. In this paper, we have done an extensive comparison of OpenMP 4.5 and OpenACC for GPU programming. Performance comparison of these two APIs on NVIDIA Tesla GPUs namely, P100 and V100 has also been captured. Data Transfer times, Kernel Execution times, Total Execution times and Performance portability are the criteria for comparison. The challenges faced while parallelizing the applications using the directives thus leading to improper outputs has also been dotted.","PeriodicalId":168544,"journal":{"name":"2020 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"A Comprehensive Comparison and Analysis of OpenACC and OpenMP 4.5 for NVIDIA GPUs\",\"authors\":\"R. Usha, P. Pandey, N. Mangala\",\"doi\":\"10.1109/HPEC43674.2020.9286203\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"HPC systems having accelerator attached to it is the new normal. However, programming these accelerators to get good performance is very complex and tedious. Hence, directive based programming such as OpenMP and OpenACC are gaining wide popularity for parallel programming. They simplify the programming experience by abstracting the low-level complexities from the user. In this paper, we have done an extensive comparison of OpenMP 4.5 and OpenACC for GPU programming. Performance comparison of these two APIs on NVIDIA Tesla GPUs namely, P100 and V100 has also been captured. Data Transfer times, Kernel Execution times, Total Execution times and Performance portability are the criteria for comparison. The challenges faced while parallelizing the applications using the directives thus leading to improper outputs has also been dotted.\",\"PeriodicalId\":168544,\"journal\":{\"name\":\"2020 IEEE High Performance Extreme Computing Conference (HPEC)\",\"volume\":\"34 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-09-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 IEEE High Performance Extreme Computing Conference (HPEC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/HPEC43674.2020.9286203\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE High Performance Extreme Computing Conference (HPEC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPEC43674.2020.9286203","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

摘要

带有加速器的高性能计算系统是新常态。然而,编程这些加速器以获得良好的性能是非常复杂和繁琐的。因此,基于指令的编程,如OpenMP和OpenACC,在并行编程中越来越受欢迎。它们从用户那里抽象出低级复杂性,从而简化了编程体验。在本文中,我们对openmp4.5和OpenACC在GPU编程方面进行了广泛的比较。这两个api在NVIDIA Tesla gpu,即P100和V100上的性能比较也被捕获。数据传输时间、内核执行时间、总执行时间和性能可移植性是比较的标准。在使用指令并行化应用程序从而导致不正确输出时所面临的挑战也被列举出来。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
A Comprehensive Comparison and Analysis of OpenACC and OpenMP 4.5 for NVIDIA GPUs
HPC systems having accelerator attached to it is the new normal. However, programming these accelerators to get good performance is very complex and tedious. Hence, directive based programming such as OpenMP and OpenACC are gaining wide popularity for parallel programming. They simplify the programming experience by abstracting the low-level complexities from the user. In this paper, we have done an extensive comparison of OpenMP 4.5 and OpenACC for GPU programming. Performance comparison of these two APIs on NVIDIA Tesla GPUs namely, P100 and V100 has also been captured. Data Transfer times, Kernel Execution times, Total Execution times and Performance portability are the criteria for comparison. The challenges faced while parallelizing the applications using the directives thus leading to improper outputs has also been dotted.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信