Share-a-GPU: Providing Simple and Effective Time-Sharing on GPUs

2018 IEEE 25th International Conference on High Performance Computing (HiPC). Pub Date: 2018-12-01. DOI: 10.1109/HiPC.2018.00041

Shaleen Garg, Kishore Kothapalli, Suresh Purini


Citations: 4

Abstract

Time-sharing, which allows multiple users to use a shared resource, is an important and fundamental aspect of modern computing systems. However, accelerators such as GPUs, which come without a native operating system, do not support time-sharing. This limits their applicability, especially as they are deployed in Platform-as-a-Service and Resource-as-a-Service environments. In the former, elastic demands may require preemption, whereas in the latter, time-sharing can support fine-grained economic models of service cost. In this paper, we extend the concept of time-sharing to the GPGPU computational space using a cooperative multitasking approach. Our technique is applicable to any GPGPU program written with the Compute Unified Device Architecture (CUDA) API provided for the C/C++ programming languages. With minimal support from the programmer, our framework incorporates process scheduling, lightweight memory management, and multi-GPU support. Our framework provides an abstraction in which, in a round-robin manner, every workload gets exclusive use of one or more GPUs for a time quantum. We demonstrate the applicability of our scheduling framework by running many workloads concurrently in a time-sharing manner.
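The round-robin, cooperative scheme described in the abstract can be illustrated with a small host-only simulation. This is a conceptual sketch, not the paper's framework: the names `workload` and `round_robin` are hypothetical, the quantum is counted in steps rather than wall-clock time, and each step merely stands in for a unit of GPU work such as a kernel launch. Memory management and multi-GPU support are not modeled.

```python
from collections import deque


def workload(name, total_steps):
    """Hypothetical workload; each step stands in for one unit of GPU work."""
    for step in range(total_steps):
        # A real CUDA program would launch a kernel here (assumption).
        # The yield is the cooperative point where the scheduler may switch.
        yield (name, step)


def round_robin(workloads, quantum):
    """Give each workload exclusive use for up to `quantum` steps per turn."""
    ready = deque(workloads)
    trace = []  # order in which steps actually executed
    while ready:
        current = ready.popleft()
        finished = False
        for _ in range(quantum):
            try:
                trace.append(next(current))
            except StopIteration:
                finished = True
                break
        if not finished:
            ready.append(current)  # back of the queue for the next round
    return trace


if __name__ == "__main__":
    for name, step in round_robin([workload("A", 3), workload("B", 5)], quantum=2):
        print(name, step)
```

With a quantum of 2, workload A runs two steps, then B runs two steps, and so on until both finish. Because switching happens only at the yield points, a workload that never yields would hold the device indefinitely, which is the usual trade-off of cooperative multitasking compared with preemption.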

