Workload-aware Dynamic GPU Resource Management in Component-based Applications

Hoda Sedighi, Daniel Gehberger, R. Glitho
Published in: 2022 IEEE International Conference on Cloud Engineering (IC2E), September 2022
DOI: 10.1109/IC2E55432.2022.00030

Abstract

In edge and cloud environments, using graphics processing units (GPUs) as high-speed parallel computing devices increases the performance of compute-intensive applications. Nowadays, due to the increase in the volume and complexity of data to be processed, GPUs are more actively used in component-based applications. As a result, a sequence of multiple interdependent components is co-located on the GPU and shares GPU resources. The overall performance of this kind of application depends on the data transfer overhead and the performance of each component in the sequence. Managing the components' competitive use of shared GPU resources faces various challenges. The lack of a low-overhead, online technique for dynamic GPU resource allocation leads to imbalanced GPU usage and penalizes overall performance. In this paper, we present efficient GPU memory and resource managers that improve overall system performance by using shared memory and dynamically assigning portions of shared GPU resources. The portions are determined by the components' workloads and a throughput-based performance analyzer, while guaranteeing the application's progress. The evaluation results show that our dynamic resource allocation method is able to improve the average performance of applications with varying numbers of concurrent components by up to 29.81% over default GPU concurrent multitasking. We also show that using shared memory results in 2x performance improvements.
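The abstract describes assigning each co-located component a portion of shared GPU resources proportional to its workload while still guaranteeing progress. The paper's actual allocation algorithm is not given here; the following is only a minimal illustrative sketch of one way such a workload-proportional split with a guaranteed minimum share could be computed (the function name, the `min_share` parameter, and the proportional policy are assumptions, not the authors' method):

```python
def allocate_shares(workloads, min_share=0.05):
    """Split the GPU resource budget (normalized to 1.0) among components.

    Each component first receives a guaranteed minimum share, so every
    component keeps making progress; the remaining budget is divided in
    proportion to each component's pending workload.
    """
    n = len(workloads)
    total = sum(workloads)
    if total == 0:
        # No pending work anywhere: fall back to an even split.
        return [1.0 / n] * n
    # Reserve the progress guarantee for everyone, then distribute the
    # rest proportionally to workload.
    remaining = 1.0 - min_share * n
    return [min_share + remaining * w / total for w in workloads]
```

In a real system the workload estimates would come from the paper's throughput-based performance analyzer, and the resulting shares would be enforced by the GPU resource manager rather than returned as plain fractions.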