Analyzing Performance and Power-Efficiency Variations among NVIDIA GPUs

Proceedings of the 51st International Conference on Parallel Processing. Pub Date: 2022-08-29. DOI: 10.1145/3545008.3545084

Kohei Yoshida, Rio Sageyama, Shinobu Miwa, Hayato Yamaki, H. Honda


Citations: 2

Abstract


Analyzing Performance and Power-Efficiency Variations among NVIDIA GPUs

Understanding the variations in performance and power-efficiency of compute nodes is important for enhancing these factors in modern supercomputing systems. Previous studies have focused on variations in CPUs and DRAMs, but there has been little attention on GPUs. This is despite many current supercomputing systems employing GPUs (which consume a significant fraction of the power of such systems) as power-efficient accelerators for HPC applications. This paper describes the first thorough evaluation of performance and power-efficiency variations in GPUs. Specifically, we execute 48 CUDA kernels on 856 devices selected from three generations of NVIDIA GPUs (P100, V100, and A100), and analyze the impact of differences in both the CUDA kernels and GPU generation on performance and power-efficiency. Our analysis shows that there are non-negligible variations in both performance and power-efficiency, and that these variations are strongly affected by both the kernels that are running and the GPU generation.
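The abstract does not say how variation is quantified, so the following is only a minimal sketch under assumed definitions: power-efficiency is taken as work per joule (work divided by runtime times average power), and variation is summarized as the relative max-to-min spread and the coefficient of variation across devices. The function names, the `work_ops` value, and the four measurements are hypothetical placeholders for illustration, not data or methods from the paper.

```python
from statistics import mean, pstdev


def efficiency(work_ops, runtime_s, avg_power_w):
    """Assumed metric: operations per joule = work / (runtime * average power)."""
    return work_ops / (runtime_s * avg_power_w)


def variation(values):
    """Return (relative max-min spread, coefficient of variation) of per-device values."""
    lo, hi = min(values), max(values)
    return (hi - lo) / lo, pstdev(values) / mean(values)


# Made-up measurements for one kernel on four devices of the same model:
# (runtime in seconds, average power in watts). Not taken from the paper.
samples = [(1.00, 250.0), (1.02, 245.0), (0.98, 260.0), (1.05, 240.0)]
work_ops = 1e12  # hypothetical amount of work done by the kernel

# Performance as throughput (operations per second) on each device.
perf = [work_ops / t for t, _ in samples]
# Power-efficiency (operations per joule) on each device.
eff = [efficiency(work_ops, t, p) for t, p in samples]

perf_spread, perf_cv = variation(perf)
eff_spread, eff_cv = variation(eff)

print(f"performance: spread={perf_spread:.2%}, CV={perf_cv:.2%}")
print(f"efficiency:  spread={eff_spread:.2%}, CV={eff_cv:.2%}")
```

Repeating this per kernel and per GPU generation would give one spread figure for each (kernel, generation) pair, which is the kind of comparison the abstract describes.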

