A virtual memory based runtime to support multi-tenancy in clusters with GPUs

M. Becchi, Kittisak Sajjapongse, I. Graves, A. Procter, Vignesh T. Ravi, S. Chakradhar
{"title":"A virtual memory based runtime to support multi-tenancy in clusters with GPUs","authors":"M. Becchi, Kittisak Sajjapongse, I. Graves, A. Procter, Vignesh T. Ravi, S. Chakradhar","doi":"10.1145/2287076.2287090","DOIUrl":null,"url":null,"abstract":"Graphics Processing Units (GPUs) are increasingly becoming part of HPC clusters. Nevertheless, cloud computing services and resource management frameworks targeting heterogeneous clusters including GPUs are still in their infancy. Further, GPU software stacks (e.g., CUDA driver and runtime) currently provide very limited support to concurrency.\n In this paper, we propose a runtime system that provides abstraction and sharing of GPUs, while allowing isolation of concurrent applications. A central component of our runtime is a memory manager that provides a virtual memory abstraction to the applications. Our runtime is flexible in terms of scheduling policies, and allows dynamic (as opposed to programmer-defined) binding of applications to GPUs. In addition, our framework supports dynamic load balancing, dynamic upgrade and downgrade of GPUs, and is resilient to their failures. Our runtime can be deployed in combination with VM-based cloud computing services to allow virtualization of heterogeneous clusters, or in combination with HPC cluster resource managers to form an integrated resource management infrastructure for heterogeneous clusters. Experiments conducted on a three-node cluster show that our GPU sharing scheme allows up to a 28% and a 50% performance improvement over serialized execution on short- and long-running jobs, respectively. Further, dynamic inter-node load balancing leads to an additional 18-20% performance benefit.","PeriodicalId":330072,"journal":{"name":"IEEE International Symposium on High-Performance Parallel Distributed Computing","volume":"83 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"63","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE International Symposium on High-Performance Parallel Distributed Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2287076.2287090","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 63

Abstract

Graphics Processing Units (GPUs) are increasingly becoming part of HPC clusters. Nevertheless, cloud computing services and resource management frameworks targeting heterogeneous clusters including GPUs are still in their infancy. Further, GPU software stacks (e.g., CUDA driver and runtime) currently provide very limited support for concurrency. In this paper, we propose a runtime system that provides abstraction and sharing of GPUs, while allowing isolation of concurrent applications. A central component of our runtime is a memory manager that provides a virtual memory abstraction to the applications. Our runtime is flexible in terms of scheduling policies, and allows dynamic (as opposed to programmer-defined) binding of applications to GPUs. In addition, our framework supports dynamic load balancing, dynamic upgrade and downgrade of GPUs, and is resilient to their failures. Our runtime can be deployed in combination with VM-based cloud computing services to allow virtualization of heterogeneous clusters, or in combination with HPC cluster resource managers to form an integrated resource management infrastructure for heterogeneous clusters. Experiments conducted on a three-node cluster show that our GPU sharing scheme allows up to a 28% and a 50% performance improvement over serialized execution on short- and long-running jobs, respectively. Further, dynamic inter-node load balancing leads to an additional 18-20% performance benefit.
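The central idea of the abstract, a virtual memory abstraction that decouples an application's GPU allocations from any particular physical device, can be illustrated with a minimal sketch. The code below is not the paper's implementation: the names (VGpuMemoryManager, VBuffer, reserve, bind, upload, release) are hypothetical, and the sketch simply assumes a host with the CUDA runtime installed. It shows deferred binding, where an allocation is reserved as a virtual handle and only materialized on a physical GPU when a scheduling decision is made.

```cpp
// Illustrative sketch only (hypothetical API, not the runtime described in the paper).
// A toy "virtual GPU memory" manager that defers binding an allocation to a
// physical GPU until a scheduler picks one.
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

struct VBuffer {
    size_t size;    // requested size in bytes
    int    device;  // physical GPU id, -1 until bound
    void*  dptr;    // device pointer, valid only after binding
};

class VGpuMemoryManager {
public:
    // Reserve a virtual buffer; no device memory is allocated yet, so the
    // scheduler remains free to choose (or later change) the physical GPU.
    int reserve(size_t size) {
        buffers_.push_back({size, -1, nullptr});
        return static_cast<int>(buffers_.size()) - 1;
    }

    // Bind the virtual buffer to a concrete GPU and allocate backing memory.
    bool bind(int handle, int device) {
        VBuffer& b = buffers_[handle];
        if (cudaSetDevice(device) != cudaSuccess) return false;
        if (cudaMalloc(&b.dptr, b.size) != cudaSuccess) return false;
        b.device = device;
        return true;
    }

    // Copy host data into the (now materialized) device buffer.
    bool upload(int handle, const void* src, size_t bytes) {
        VBuffer& b = buffers_[handle];
        if (b.dptr == nullptr) return false;
        cudaSetDevice(b.device);
        return cudaMemcpy(b.dptr, src, bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    }

    // Free the backing memory; the handle could later be re-bound elsewhere.
    void release(int handle) {
        VBuffer& b = buffers_[handle];
        if (b.dptr != nullptr) {
            cudaSetDevice(b.device);
            cudaFree(b.dptr);
            b.dptr = nullptr;
            b.device = -1;
        }
    }

private:
    std::vector<VBuffer> buffers_;
};

int main() {
    int ngpus = 0;
    cudaGetDeviceCount(&ngpus);
    if (ngpus == 0) { std::printf("no GPU available\n"); return 0; }

    VGpuMemoryManager mm;
    std::vector<float> host(1 << 20, 1.0f);

    int h = mm.reserve(host.size() * sizeof(float)); // virtual handle, no GPU chosen yet
    if (!mm.bind(h, 0)) {                            // scheduler decision happens here
        std::printf("bind failed\n");
        return 1;
    }
    mm.upload(h, host.data(), host.size() * sizeof(float));
    mm.release(h);
    return 0;
}
```

Deferring the physical binding in this way is what allows a scheduler to choose, and later change, the GPU that backs an application, which is the kind of dynamic binding and load balancing the abstract describes.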