支持cuda的Hadoop集群,用于快速分布式图像处理

Ranajoy Malakar, N. Vydyanathan
{"title":"支持cuda的Hadoop集群,用于快速分布式图像处理","authors":"Ranajoy Malakar, N. Vydyanathan","doi":"10.1109/PARCOMPTECH.2013.6621392","DOIUrl":null,"url":null,"abstract":"Hadoop is a map-reduce based distributed processing framework, frequently used in the industry today, in areas of big data analysis, particularly text analysis. Graphics processing units (GPUs), on the other hand, are massively parallel platforms with attractive performance to price and power ratios, used extensively in the recent years for acceleration of data parallel computations. CUDA or Compute Unified Device Architecture is a C-based programming model proposed by NVIDIA for leveraging the parallel computing capabilities of the GPU for general purpose computations. This paper attempts to integrate CUDA acceleration into the Hadoop distributed processing framework to create a heterogeneous high performance image processing system. As Hadoop primarily is used for text analysis, this involves facilitating efficient image processing in Hadoop. Our experimental evaluations using a Adaboost based face detection algorithm indicate that CUDA-enabling a Hadoop cluster, even with low-end GPUs, can result in a 25% improvement in data processing throughput, indicating that an integration of these two technologies can help build scalable, high throughput, power and cost-efficient computing platforms.","PeriodicalId":344858,"journal":{"name":"2013 National Conference on Parallel Computing Technologies (PARCOMPTECH)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"19","resultStr":"{\"title\":\"A CUDA-enabled Hadoop cluster for fast distributed image processing\",\"authors\":\"Ranajoy Malakar, N. Vydyanathan\",\"doi\":\"10.1109/PARCOMPTECH.2013.6621392\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Hadoop is a map-reduce based distributed processing framework, frequently used in the industry today, in areas of big data analysis, particularly text analysis. Graphics processing units (GPUs), on the other hand, are massively parallel platforms with attractive performance to price and power ratios, used extensively in the recent years for acceleration of data parallel computations. CUDA or Compute Unified Device Architecture is a C-based programming model proposed by NVIDIA for leveraging the parallel computing capabilities of the GPU for general purpose computations. This paper attempts to integrate CUDA acceleration into the Hadoop distributed processing framework to create a heterogeneous high performance image processing system. As Hadoop primarily is used for text analysis, this involves facilitating efficient image processing in Hadoop. Our experimental evaluations using a Adaboost based face detection algorithm indicate that CUDA-enabling a Hadoop cluster, even with low-end GPUs, can result in a 25% improvement in data processing throughput, indicating that an integration of these two technologies can help build scalable, high throughput, power and cost-efficient computing platforms.\",\"PeriodicalId\":344858,\"journal\":{\"name\":\"2013 National Conference on Parallel Computing Technologies (PARCOMPTECH)\",\"volume\":\"33 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-10-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"19\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2013 National Conference on Parallel Computing Technologies (PARCOMPTECH)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/PARCOMPTECH.2013.6621392\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 National Conference on Parallel Computing Technologies (PARCOMPTECH)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PARCOMPTECH.2013.6621392","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 19

摘要

Hadoop是一个基于map-reduce的分布式处理框架,在当今的行业中,在大数据分析领域,尤其是文本分析中经常使用。另一方面,图形处理单元(gpu)是具有具有吸引力的性能价格比和功率比的大规模并行平台,近年来被广泛用于加速数据并行计算。CUDA或计算统一设备架构是NVIDIA提出的基于c语言的编程模型,用于利用GPU的并行计算能力进行通用计算。本文试图将CUDA加速集成到Hadoop分布式处理框架中,创建一个异构的高性能图像处理系统。由于Hadoop主要用于文本分析,这涉及到在Hadoop中促进高效的图像处理。我们使用基于Adaboost的人脸检测算法进行的实验评估表明,启用cuda的Hadoop集群,即使使用低端gpu,也可以使数据处理吞吐量提高25%,这表明这两种技术的集成可以帮助构建可扩展,高吞吐量,功率和成本效益的计算平台。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
A CUDA-enabled Hadoop cluster for fast distributed image processing
Hadoop is a map-reduce based distributed processing framework, frequently used in the industry today, in areas of big data analysis, particularly text analysis. Graphics processing units (GPUs), on the other hand, are massively parallel platforms with attractive performance to price and power ratios, used extensively in the recent years for acceleration of data parallel computations. CUDA or Compute Unified Device Architecture is a C-based programming model proposed by NVIDIA for leveraging the parallel computing capabilities of the GPU for general purpose computations. This paper attempts to integrate CUDA acceleration into the Hadoop distributed processing framework to create a heterogeneous high performance image processing system. As Hadoop primarily is used for text analysis, this involves facilitating efficient image processing in Hadoop. Our experimental evaluations using a Adaboost based face detection algorithm indicate that CUDA-enabling a Hadoop cluster, even with low-end GPUs, can result in a 25% improvement in data processing throughput, indicating that an integration of these two technologies can help build scalable, high throughput, power and cost-efficient computing platforms.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信