Scaling Computation on GPUs Using Powerlists

Anshu S. Anand, R. Shyamasundar
{"title":"使用Powerlists在gpu上缩放计算","authors":"Anshu S. Anand, R. Shyamasundar","doi":"10.1109/HiPCW.2015.14","DOIUrl":null,"url":null,"abstract":"With the explosion of big data analytics, scaling linear algebra packages has become extremely important. Inthe context of GPUs, cuBLAS API provides a highly efficientpackage for linear algebra subroutines on a single GPU. Dueto inputs of large dimensions, it often becomes necessary tocompute over clusters. However, the package does not provide facilities for computing over a 'cluster of GPUs' efficiently. Inthis paper, we demonstrate a high level framework for scaling linear algebra computations across a cluster of GPUs, through matrix multiplication problem. In particular, we describe amethod of specifying matrices using powerlists that captures both parallelism and recursion succinctly, and automatically schedule partitioned matrices over a GPU cluster to gain the advantages of cuBLAS for computing the product of partitioned matrices over a cluster of GPUs. Our experimental results show significant performance gains, of the order ofat least 132% for large matrices over that of a single GPUcomputation. The method reflects the map-reduce paradigmwhere the matrices are mapped to appropriate partitioned matrices and sent to appropriate members of the clusters andthe results are collected to obtain the resultant matrix.","PeriodicalId":203902,"journal":{"name":"2015 IEEE 22nd International Conference on High Performance Computing Workshops","volume":"22 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"Scaling Computation on GPUs Using Powerlists\",\"authors\":\"Anshu S. Anand, R. Shyamasundar\",\"doi\":\"10.1109/HiPCW.2015.14\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With the explosion of big data analytics, scaling linear algebra packages has become extremely important. Inthe context of GPUs, cuBLAS API provides a highly efficientpackage for linear algebra subroutines on a single GPU. Dueto inputs of large dimensions, it often becomes necessary tocompute over clusters. However, the package does not provide facilities for computing over a 'cluster of GPUs' efficiently. Inthis paper, we demonstrate a high level framework for scaling linear algebra computations across a cluster of GPUs, through matrix multiplication problem. In particular, we describe amethod of specifying matrices using powerlists that captures both parallelism and recursion succinctly, and automatically schedule partitioned matrices over a GPU cluster to gain the advantages of cuBLAS for computing the product of partitioned matrices over a cluster of GPUs. Our experimental results show significant performance gains, of the order ofat least 132% for large matrices over that of a single GPUcomputation. 
The method reflects the map-reduce paradigmwhere the matrices are mapped to appropriate partitioned matrices and sent to appropriate members of the clusters andthe results are collected to obtain the resultant matrix.\",\"PeriodicalId\":203902,\"journal\":{\"name\":\"2015 IEEE 22nd International Conference on High Performance Computing Workshops\",\"volume\":\"22 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-12-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2015 IEEE 22nd International Conference on High Performance Computing Workshops\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/HiPCW.2015.14\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 IEEE 22nd International Conference on High Performance Computing Workshops","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HiPCW.2015.14","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 5

Abstract

With the explosion of big data analytics, scaling linear algebra packages has become extremely important. In the context of GPUs, the cuBLAS API provides a highly efficient package for linear algebra subroutines on a single GPU. Due to inputs of large dimensions, it often becomes necessary to compute over clusters. However, the package does not provide facilities for computing over a 'cluster of GPUs' efficiently. In this paper, we demonstrate a high-level framework for scaling linear algebra computations across a cluster of GPUs, through the matrix multiplication problem. In particular, we describe a method of specifying matrices using powerlists that captures both parallelism and recursion succinctly, and automatically schedules partitioned matrices over a GPU cluster to gain the advantages of cuBLAS for computing the product of partitioned matrices over a cluster of GPUs. Our experimental results show significant performance gains, of the order of at least 132% for large matrices over that of a single-GPU computation. The method reflects the map-reduce paradigm, where the matrices are mapped to appropriate partitioned matrices and sent to appropriate members of the cluster, and the results are collected to obtain the resultant matrix.
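
The powerlists mentioned in the abstract are Misra's powerlist structure: a list of length 2^n that can be built or taken apart in two ways, by concatenating two halves (tie) or by interleaving even- and odd-indexed elements (zip), which is what lets a single specification capture both recursion and parallelism. The following Python sketch illustrates those semantics under our own naming; untie, unzip, tie, pl_zip, and pl_add are illustrative names, not functions from the paper:

    # Minimal powerlist sketch (assumed Misra-style semantics).
    # A powerlist has length 2^n; it splits either into left/right
    # halves ("untie") or into even/odd elements ("unzip").

    def untie(p):
        """Left and right halves of p: the inverse of tie."""
        mid = len(p) // 2
        return p[:mid], p[mid:]

    def unzip(p):
        """Even- and odd-indexed elements of p: the inverse of zip."""
        return p[0::2], p[1::2]

    def tie(u, v):
        """Concatenate two similar powerlists: u | v."""
        return u + v

    def pl_zip(u, v):
        """Interleave two similar powerlists."""
        out = []
        for a, b in zip(u, v):
            out.extend([a, b])
        return out

    def pl_add(u, v):
        """Pointwise sum written as a powerlist recursion. The two
        recursive calls on the tied halves are independent, which is
        exactly the parallelism the notation exposes."""
        if len(u) == 1:
            return [u[0] + v[0]]
        ul, ur = untie(u)
        vl, vr = untie(v)
        return tie(pl_add(ul, vl), pl_add(ur, vr))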
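The map-reduce flow the abstract describes can be made concrete for one level of partitioning: split A and B into quadrants, map the eight independent block products to cluster members, and reduce by summing pairs of partial products into the quadrants of C. The sketch below is a local simulation under assumed names (quadrants, dispatch_gemm, and block_multiply are ours, not the paper's); on a real cluster, dispatch_gemm would ship its operands to a worker node and invoke a cuBLAS GEMM there rather than multiplying locally:

    import numpy as np

    def quadrants(M):
        """Split a 2^n x 2^n matrix into four equal blocks."""
        h = M.shape[0] // 2
        return M[:h, :h], M[:h, h:], M[h:, :h], M[h:, h:]

    def dispatch_gemm(X, Y):
        # Hypothetical stand-in: on a cluster this would send X, Y to a
        # GPU node and run a cuBLAS GEMM there; here we multiply locally.
        return X @ Y

    def block_multiply(A, B):
        A11, A12, A21, A22 = quadrants(A)
        B11, B12, B21, B22 = quadrants(B)
        # "Map": eight independent block products, schedulable across nodes.
        tasks = [(A11, B11), (A12, B21), (A11, B12), (A12, B22),
                 (A21, B11), (A22, B21), (A21, B12), (A22, B22)]
        parts = [dispatch_gemm(X, Y) for X, Y in tasks]
        # "Reduce": sum pairs of partial products and assemble C.
        top = np.hstack([parts[0] + parts[1], parts[2] + parts[3]])
        bottom = np.hstack([parts[4] + parts[5], parts[6] + parts[7]])
        return np.vstack([top, bottom])

Recursing on the quadrants instead of multiplying them directly gives the same divide-and-conquer structure the powerlist specification expresses, with the recursion bottoming out at a block size a single GPU handles well.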