Workload-aware Dynamic GPU Resource Management in Component-based Applications

Hoda Sedighi, Daniel Gehberger, R. Glitho
Published in: 2022 IEEE International Conference on Cloud Engineering (IC2E), September 2022
DOI: 10.1109/IC2E55432.2022.00030

Abstract

In edge and cloud environments, using graphics processing units (GPUs) as high-speed parallel computing devices increases the performance of compute-intensive applications. Nowadays, due to the increase in the volume and complexity of data to be processed, GPUs are more actively used in component-based applications. As a result, a sequence of multiple interdependent components is co-located on the GPU and shares GPU resources. The overall performance of this kind of application depends on the data transfer overhead and the performance of each component in the sequence. Managing the components' competitive use of shared GPU resources faces various challenges. The lack of a low-overhead, online technique for dynamic GPU resource allocation leads to imbalanced GPU usage and penalizes overall performance. In this paper, we present efficient GPU memory and resource managers that improve overall system performance by using shared memory and dynamically assigning portions of shared GPU resources. The portions are determined by the components' workloads and a throughput-based performance analyzer, while guaranteeing the application's progress. The evaluation results show that our dynamic resource allocation method is able to improve the average performance of applications with varying numbers of concurrent components by up to 29.81% over default GPU concurrent multitasking. We also show that using shared memory results in 2x performance improvements.
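The abstract describes assigning each co-located component a portion of shared GPU resources proportional to its workload while still guaranteeing progress. The paper's actual allocation algorithm is not given here; the following is only a minimal illustrative sketch of one way such a workload-proportional split with a guaranteed minimum share could be computed (the function name, the `min_share` parameter, and the proportional policy are assumptions, not the authors' method):

```python
def allocate_shares(workloads, min_share=0.05):
    """Split the GPU resource budget (normalized to 1.0) among components.

    Each component first receives a guaranteed minimum share, so every
    component keeps making progress; the remaining budget is divided in
    proportion to each component's pending workload.
    """
    n = len(workloads)
    total = sum(workloads)
    if total == 0:
        # No pending work anywhere: fall back to an even split.
        return [1.0 / n] * n
    # Reserve the progress guarantee for everyone, then distribute the
    # rest proportionally to workload.
    remaining = 1.0 - min_share * n
    return [min_share + remaining * w / total for w in workloads]
```

In a real system the workload estimates would come from the paper's throughput-based performance analyzer, and the resulting shares would be enforced by the GPU resource manager rather than returned as plain fractions.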