Data-Driven Analysis to Understand GPU Hardware Resource Usage of Optimizations

Tanzima Z. Islam, Aniruddha Marathe, Holland Schutte, Mohammad Zaeed
Source: arXiv - CS - Performance
Publication date: 2024-08-19
DOI: arxiv-2408.10143 (https://doi.org/arxiv-2408.10143)
Citations: 0

Abstract

With heterogeneous systems, the number of GPUs per chip increases to provide computational capabilities for solving science at a nanoscopic scale. However, low utilization of single GPUs undermines the case for investing more money in expensive accelerators. While related work develops optimizations for improving application performance, none studies how these optimizations impact hardware resource usage or the average GPU utilization. This paper takes a data-driven analysis approach to addressing this gap by (1) characterizing how hardware resource usage affects device utilization, execution time, or both, (2) presenting a multi-objective metric to identify important application-device interactions that can be optimized to improve device utilization and application performance jointly, (3) studying hardware resource usage behaviors of several optimizations for a benchmark application, and finally (4) identifying optimization opportunities for several scientific proxy applications based on their hardware resource usage behaviors. Furthermore, we demonstrate the applicability of our methodology by applying the identified optimizations to a proxy application, which improves the execution time, device utilization, and power consumption by up to 29.6%, 5.3%, and 26.5%, respectively.