支持cuda的Hadoop集群用于稀疏矩阵向量乘法

M. Reza, Aman Sinha, Rajkumar Nag, P. Mohanty
{"title":"支持cuda的Hadoop集群用于稀疏矩阵向量乘法","authors":"M. Reza, Aman Sinha, Rajkumar Nag, P. Mohanty","doi":"10.1109/ReTIS.2015.7232872","DOIUrl":null,"url":null,"abstract":"Compute Unified Device Architecture (CUDA) is an architecture and programming model that allows leveraging the high compute-intensive processing power of the Graphical Processing Units (GPUs) to perform general, non-graphical tasks in a massively parallel manner. Hadoop is an open-source software framework that has its own file system, the Hadoop Distributed File System (HDFS), and its own programming model, the Map Reduce, in order to accomplish the tasks of storage of very large amount of data and their fast processing in a distributed manner in a cluster of inexpensive hardware. This paper presents a model and implementation of a Hadoop-CUDA Hybrid approach to perform Sparse Matrix Vector Multiplication (SpMV) of very large matrices in a very high performing manner. Hadoop is used for splitting the input matrix into smaller sub-matrices, storing them on individual data nodes and then invoking the required CUDA kernels on the individual GPU-possessing cluster nodes. The original SpMV is done using CUDA. Such an implementation has been seen to improve the performance of the SpMV operation over very large matrices by speedup of around 1.4 in comparison to non-Hadoop, single-GPU CUDA implementation.","PeriodicalId":161306,"journal":{"name":"2015 IEEE 2nd International Conference on Recent Trends in Information Systems (ReTIS)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"CUDA-enabled Hadoop cluster for Sparse Matrix Vector Multiplication\",\"authors\":\"M. Reza, Aman Sinha, Rajkumar Nag, P. Mohanty\",\"doi\":\"10.1109/ReTIS.2015.7232872\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Compute Unified Device Architecture (CUDA) is an architecture and programming model that allows leveraging the high compute-intensive processing power of the Graphical Processing Units (GPUs) to perform general, non-graphical tasks in a massively parallel manner. Hadoop is an open-source software framework that has its own file system, the Hadoop Distributed File System (HDFS), and its own programming model, the Map Reduce, in order to accomplish the tasks of storage of very large amount of data and their fast processing in a distributed manner in a cluster of inexpensive hardware. This paper presents a model and implementation of a Hadoop-CUDA Hybrid approach to perform Sparse Matrix Vector Multiplication (SpMV) of very large matrices in a very high performing manner. Hadoop is used for splitting the input matrix into smaller sub-matrices, storing them on individual data nodes and then invoking the required CUDA kernels on the individual GPU-possessing cluster nodes. The original SpMV is done using CUDA. Such an implementation has been seen to improve the performance of the SpMV operation over very large matrices by speedup of around 1.4 in comparison to non-Hadoop, single-GPU CUDA implementation.\",\"PeriodicalId\":161306,\"journal\":{\"name\":\"2015 IEEE 2nd International Conference on Recent Trends in Information Systems (ReTIS)\",\"volume\":\"17 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-07-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2015 IEEE 2nd International Conference on Recent Trends in Information Systems (ReTIS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ReTIS.2015.7232872\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 IEEE 2nd International Conference on Recent Trends in Information Systems (ReTIS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ReTIS.2015.7232872","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 6

摘要

计算统一设备架构(CUDA)是一种架构和编程模型,它允许利用图形处理单元(gpu)的高计算密集型处理能力,以大规模并行的方式执行一般的非图形任务。Hadoop是一个开源软件框架,它有自己的文件系统Hadoop分布式文件系统(HDFS)和自己的编程模型Map Reduce,以便在廉价硬件集群中以分布式方式完成大量数据的存储和快速处理任务。本文提出了一种Hadoop-CUDA混合方法的模型和实现,以非常高性能的方式对非常大的矩阵执行稀疏矩阵向量乘法(SpMV)。Hadoop用于将输入矩阵分割成更小的子矩阵,将它们存储在单个数据节点上,然后在单个拥有gpu的集群节点上调用所需的CUDA内核。最初的SpMV是使用CUDA完成的。与非hadoop、单gpu CUDA实现相比,这样的实现可以在非常大的矩阵上提高SpMV操作的性能,加速速度约为1.4。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
CUDA-enabled Hadoop cluster for Sparse Matrix Vector Multiplication
Compute Unified Device Architecture (CUDA) is an architecture and programming model that allows leveraging the high compute-intensive processing power of the Graphical Processing Units (GPUs) to perform general, non-graphical tasks in a massively parallel manner. Hadoop is an open-source software framework that has its own file system, the Hadoop Distributed File System (HDFS), and its own programming model, the Map Reduce, in order to accomplish the tasks of storage of very large amount of data and their fast processing in a distributed manner in a cluster of inexpensive hardware. This paper presents a model and implementation of a Hadoop-CUDA Hybrid approach to perform Sparse Matrix Vector Multiplication (SpMV) of very large matrices in a very high performing manner. Hadoop is used for splitting the input matrix into smaller sub-matrices, storing them on individual data nodes and then invoking the required CUDA kernels on the individual GPU-possessing cluster nodes. The original SpMV is done using CUDA. Such an implementation has been seen to improve the performance of the SpMV operation over very large matrices by speedup of around 1.4 in comparison to non-Hadoop, single-GPU CUDA implementation.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信