Scheduling Multi-tenant Cloud Workloads on Accelerator-Based Systems

SC14: International Conference for High Performance Computing, Networking, Storage and Analysis. Pub Date: 2014-11-16. DOI: 10.1109/SC.2014.47

D. Sengupta, Anshuman Goswami, K. Schwan, K. Pallavi

Citations: 34

Abstract

Accelerator-based systems are rapidly becoming platforms of choice for high-end cloud services. There is therefore a need to move from the current model, in which high-performance applications explicitly and programmatically select the GPU devices on which to run, to a dynamic model in which GPUs are treated as first-class schedulable entities. The Strings scheduler realizes this vision by decomposing the GPU scheduling problem into a combination of load balancing and per-device scheduling: (i) device-level scheduling efficiently uses all of a GPU's hardware resources, including its computational and data movement engines, and (ii) load balancing goes beyond obtaining high throughput to ensure fairness by prioritizing GPU requests that have attained the least service. With these methods, Strings achieves improvements in system throughput and fairness of up to 8.70× and 13%, respectively, compared to the CUDA runtime.
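The fairness idea in the abstract, prioritizing requests that have attained the least service while balancing load across devices, can be illustrated with a small simulation. The sketch below is not the Strings implementation. It is a minimal illustration under stated assumptions: each request carries a known cost estimate in seconds, the next request always comes from the tenant with the least accumulated service, and it is placed on the GPU with the least accumulated work. The function name `dispatch_las` and its parameters are hypothetical.

```python
import heapq
from collections import defaultdict, deque


def dispatch_las(requests, num_gpus):
    """Illustrative least-attained-service dispatch (hypothetical helper).

    requests: list of (tenant, est_seconds) tuples in arrival order.
    Returns a list of (tenant, gpu_id) tuples in dispatch order.
    """
    # Per-tenant FIFO queues of pending request costs.
    pending = defaultdict(deque)
    for tenant, cost in requests:
        pending[tenant].append(cost)

    # Service (in seconds) each tenant has received so far.
    attained = {tenant: 0.0 for tenant in pending}

    # Min-heap of (accumulated_seconds, gpu_id): the root is the least-loaded GPU.
    gpus = [(0.0, gpu_id) for gpu_id in range(num_gpus)]
    heapq.heapify(gpus)

    order = []
    while any(pending.values()):
        # Fairness: pick the waiting tenant with the least attained service
        # (ties broken by tenant name so the result is deterministic).
        tenant = min(
            (t for t in pending if pending[t]),
            key=lambda t: (attained[t], t),
        )
        cost = pending[tenant].popleft()

        # Load balancing: place the request on the least-loaded GPU.
        load, gpu_id = heapq.heappop(gpus)
        heapq.heappush(gpus, (load + cost, gpu_id))

        attained[tenant] += cost
        order.append((tenant, gpu_id))
    return order


if __name__ == "__main__":
    demo = [("A", 4.0), ("A", 4.0), ("B", 1.0), ("B", 1.0), ("B", 1.0)]
    print(dispatch_las(demo, num_gpus=2))
```

In this toy run, tenant B's three short requests are all dispatched before tenant A's second long request, because B has attained less service than A after A's first request. A real scheduler would also have to estimate request costs online and overlap computation with data movement on each device, which this sketch does not model.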



