Shadowfax: scaling in heterogeneous cluster systems via GPGPU assemblies

A. Merritt, Vishakha Gupta, Abhishek Verma, Ada Gavrilovska, K. Schwan
{"title":"Shadowfax: scaling in heterogeneous cluster systems via GPGPU assemblies","authors":"A. Merritt, Vishakha Gupta, Abhishek Verma, Ada Gavrilovska, K. Schwan","doi":"10.1145/1996121.1996124","DOIUrl":null,"url":null,"abstract":"Systems with specialized processors such as those used for accel- erating computations (like NVIDIA's graphics processors or IBM's Cell) have proven their utility in terms of higher performance and lower power consumption. They have also been shown to outperform general purpose processors in case of graphics intensive or high performance applications and for enterprise applications like modern financial codes or web hosts that require scalable image processing. These facts are causing tremendous growth in accelerator-based platforms in the high performance domain with systems like Keeneland, supercomputers like Tianhe-1, RoadRunner and even in data center systems like Amazon's EC2.\n The physical hardware in these systems, once purchased and assembled, is not reconfigurable and is expensive to modify or upgrade. This can eventually limit applications' performance and scalability unless they are rewritten to match specific versions of hardware and compositions of components, both for single nodes and for clusters of machines. To address this problem and to support increased flexibility in usage models for CUDA-based GPGPU applications, our research proposes GPGPU assemblies, where each assembly combines a desired number of CPUs and CUDA-supported GPGPUs to form a 'virtual execution platform' for an application. System-level software, then, creates and manages assemblies, including mapping them seamlessly to the actual cluster- and node- level hardware resources present in the system. Experimental evaluations of the initial implementation of GPGPU assemblies demonstrates their feasibility and advantages derived from their use.","PeriodicalId":176127,"journal":{"name":"Virtualization Technologies in Distributed Computing","volume":"164 ","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"40","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Virtualization Technologies in Distributed Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1996121.1996124","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 40

Abstract

Systems with specialized processors such as those used for accelerating computations (like NVIDIA's graphics processors or IBM's Cell) have proven their utility in terms of higher performance and lower power consumption. They have also been shown to outperform general-purpose processors for graphics-intensive or high-performance applications and for enterprise applications like modern financial codes or web hosts that require scalable image processing. These facts are driving tremendous growth in accelerator-based platforms in the high-performance domain, with systems like Keeneland, supercomputers like Tianhe-1 and RoadRunner, and even data center systems like Amazon's EC2. The physical hardware in these systems, once purchased and assembled, is not reconfigurable and is expensive to modify or upgrade. This can eventually limit applications' performance and scalability unless they are rewritten to match specific versions of hardware and compositions of components, both for single nodes and for clusters of machines. To address this problem and to support increased flexibility in usage models for CUDA-based GPGPU applications, our research proposes GPGPU assemblies, where each assembly combines a desired number of CPUs and CUDA-supported GPGPUs to form a 'virtual execution platform' for an application. System-level software then creates and manages assemblies, including mapping them seamlessly to the actual cluster- and node-level hardware resources present in the system. Experimental evaluations of the initial implementation of GPGPU assemblies demonstrate their feasibility and the advantages derived from their use.
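The abstract contains no code, but a minimal sketch can illustrate the assembly idea it describes: an application declares how many CPUs and CUDA GPGPUs it wants, and system-level software maps that request onto whatever node-level resources the cluster actually has. The AssemblyRequest, NodeResources, and map_assembly names below, and the greedy first-fit placement policy, are hypothetical illustrations only; they are not Shadowfax's actual interface or algorithm.

// Hypothetical sketch: a GPGPU "assembly" request and a greedy mapping of
// that request onto cluster nodes. Names and policy are illustrative only.
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

struct AssemblyRequest {      // what the application asks for
    int cpus;
    int gpus;                 // CUDA-capable GPGPUs
};

struct NodeResources {        // what one physical node can currently offer
    std::string name;
    int free_cpus;
    int free_gpus;
};

struct Placement {            // slice of one node assigned to the assembly
    std::string node;
    int cpus;
    int gpus;
};

// Greedily carve the requested CPUs/GPUs out of the available nodes.
// Returns an empty plan if the cluster cannot satisfy the request.
std::vector<Placement> map_assembly(AssemblyRequest req,
                                    std::vector<NodeResources>& nodes) {
    std::vector<Placement> plan;
    for (auto& n : nodes) {
        if (req.cpus <= 0 && req.gpus <= 0) break;
        int take_cpus = std::min(req.cpus, n.free_cpus);
        int take_gpus = std::min(req.gpus, n.free_gpus);
        if (take_cpus == 0 && take_gpus == 0) continue;
        plan.push_back({n.name, take_cpus, take_gpus});
        n.free_cpus -= take_cpus;
        n.free_gpus -= take_gpus;
        req.cpus -= take_cpus;
        req.gpus -= take_gpus;
    }
    if (req.cpus > 0 || req.gpus > 0) plan.clear();  // request not satisfiable
    return plan;
}

int main() {
    // The application asks for a "virtual execution platform" of 4 CPUs and
    // 3 GPUs, more GPUs than any single node below has on its own.
    AssemblyRequest req{4, 3};
    std::vector<NodeResources> cluster = {
        {"node0", 8, 2}, {"node1", 8, 2}, {"node2", 8, 0}};

    for (const auto& p : map_assembly(req, cluster))
        std::cout << p.node << ": " << p.cpus << " CPUs, "
                  << p.gpus << " GPUs\n";
}

In the real system this mapping must be done seamlessly for CUDA-based applications across both node- and cluster-level resources; the sketch covers only the resource-accounting side of that problem.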