DS-CUDA:在云环境中使用多个gpu的中间件

2012 SC Companion: High Performance Computing, Networking Storage and Analysis Pub Date : 2012-11-10 DOI:10.1109/SC.Companion.2012.146

Minoru Oikawa, A. Kawai, K. Nomura, K. Yasuoka, Kazuyuki Yoshikawa, T. Narumi

{"title":"DS-CUDA:在云环境中使用多个gpu的中间件","authors":"Minoru Oikawa, A. Kawai, K. Nomura, K. Yasuoka, Kazuyuki Yoshikawa, T. Narumi","doi":"10.1109/SC.Companion.2012.146","DOIUrl":null,"url":null,"abstract":"GPGPU (General-purpose computing on graphics processing units) has several difficulties when used in cloud environment, such as narrow bandwidth, higher cost, and lower security, compared with computation using only CPUs. Most high performance computing applications require huge communication between nodes, and do not fit a cloud environment, since network topology and its bandwidth are not fixed and they affect the performance of the application program. However, there are some applications for which little communication is needed, such as molecular dynamics (MD) simulation with the replica exchange method (REM). For such applications, we propose DS-CUDA (Distributed-shared compute unified device architecture), a middleware to use many GPUs in a cloud environment with lower cost and higher security. It virtualizes GPUs in a cloud such that they appear to be locally installed GPUs in a client machine. Its redundant mechanism ensures reliable calculation with consumer GPUs, which reduce the cost greatly. It also enhances the security level since no data except command and data for GPUs are stored in the cloud side. REM-MD simulation with 64 GPUs showed 58 and 36 times more speed than a locally-installed GPU via InfiniBand and the Internet, respectively.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"132 1","pages":"1207-1214"},"PeriodicalIF":0.0000,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"73","resultStr":"{\"title\":\"DS-CUDA: A Middleware to Use Many GPUs in the Cloud Environment\",\"authors\":\"Minoru Oikawa, A. Kawai, K. Nomura, K. Yasuoka, Kazuyuki Yoshikawa, T. Narumi\",\"doi\":\"10.1109/SC.Companion.2012.146\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"GPGPU (General-purpose computing on graphics processing units) has several difficulties when used in cloud environment, such as narrow bandwidth, higher cost, and lower security, compared with computation using only CPUs. Most high performance computing applications require huge communication between nodes, and do not fit a cloud environment, since network topology and its bandwidth are not fixed and they affect the performance of the application program. However, there are some applications for which little communication is needed, such as molecular dynamics (MD) simulation with the replica exchange method (REM). For such applications, we propose DS-CUDA (Distributed-shared compute unified device architecture), a middleware to use many GPUs in a cloud environment with lower cost and higher security. It virtualizes GPUs in a cloud such that they appear to be locally installed GPUs in a client machine. Its redundant mechanism ensures reliable calculation with consumer GPUs, which reduce the cost greatly. It also enhances the security level since no data except command and data for GPUs are stored in the cloud side. REM-MD simulation with 64 GPUs showed 58 and 36 times more speed than a locally-installed GPU via InfiniBand and the Internet, respectively.\",\"PeriodicalId\":6346,\"journal\":{\"name\":\"2012 SC Companion: High Performance Computing, Networking Storage and Analysis\",\"volume\":\"132 1\",\"pages\":\"1207-1214\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-11-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"73\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2012 SC Companion: High Performance Computing, Networking Storage and Analysis\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SC.Companion.2012.146\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SC.Companion.2012.146","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 73

摘要

GPGPU (General-purpose computing on graphics processing unit，图形处理单元上的通用计算)在云环境中使用时，与仅使用cpu进行计算相比，存在带宽窄、成本高、安全性低等问题。大多数高性能计算应用需要在节点之间进行大量通信，并且不适合云环境，因为网络拓扑及其带宽不是固定的，并且会影响应用程序的性能。然而，也有一些应用程序几乎不需要通信，例如使用副本交换方法(REM)的分子动力学(MD)模拟。针对这样的应用，我们提出了DS-CUDA (Distributed-shared compute unified device architecture，分布式共享计算统一设备架构)，这是一种在云环境中使用多个gpu的中间件，具有更低的成本和更高的安全性。它在云端虚拟化gpu，使它们看起来像是在客户端机器上本地安装的gpu。它的冗余机制保证了与消费级gpu的可靠计算，大大降低了成本。除了命令和gpu的数据，没有其他数据存储在云端，提高了安全性。使用64个GPU的REM-MD模拟的速度分别是通过InfiniBand和Internet安装的本地GPU的58倍和36倍。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

DS-CUDA: A Middleware to Use Many GPUs in the Cloud Environment

GPGPU (General-purpose computing on graphics processing units) has several difficulties when used in cloud environment, such as narrow bandwidth, higher cost, and lower security, compared with computation using only CPUs. Most high performance computing applications require huge communication between nodes, and do not fit a cloud environment, since network topology and its bandwidth are not fixed and they affect the performance of the application program. However, there are some applications for which little communication is needed, such as molecular dynamics (MD) simulation with the replica exchange method (REM). For such applications, we propose DS-CUDA (Distributed-shared compute unified device architecture), a middleware to use many GPUs in a cloud environment with lower cost and higher security. It virtualizes GPUs in a cloud such that they appear to be locally installed GPUs in a client machine. Its redundant mechanism ensures reliable calculation with consumer GPUs, which reduce the cost greatly. It also enhances the security level since no data except command and data for GPUs are stored in the cloud side. REM-MD simulation with 64 GPUs showed 58 and 36 times more speed than a locally-installed GPU via InfiniBand and the Internet, respectively.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2012 SC Companion: High Performance Computing, Networking Storage and Analysis

自引率

0.00%

发文量