An Implementation of GPU Accelerated MapReduce: Using Hadoop with OpenCL for Data- and Compute-Intensive Jobs
Authors: Miao Xin, Hao Li
Published in: 2012 International Joint Conference on Service Sciences (2012-05-24)
DOI: 10.1109/IJCSS.2012.22 (https://doi.org/10.1109/IJCSS.2012.22)
MapReduce is an efficient distributed computing model for large-scale data processing. However, single-node performance is gradually becoming the bottleneck in compute-intensive jobs. This paper presents an approach to improving MapReduce with GPU acceleration, implemented using Hadoop and OpenCL. Unlike other implementations, it targets general-purpose, inexpensive hardware platforms and integrates seamlessly with Apache Hadoop, one of the most widely used MapReduce frameworks. As a heterogeneous multi-machine, multicore architecture, it is aimed at both data-intensive and compute-intensive applications. A performance improvement of almost 2x has been validated without any further optimization.
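The general idea of offloading the compute-heavy part of a map task to a data-parallel device can be illustrated with a small sketch. This is not the paper's implementation and uses neither the Hadoop nor the OpenCL API: it is a minimal in-process MapReduce in Python, written under the assumption that each map task gathers the records of its input split into one batch and hands that batch to a single kernel call. All names here (`square_kernel`, `batched_map`, `shuffle`, `reduce_sum`, `run_job`) are hypothetical, and the "kernel" is an ordinary CPU function standing in for a GPU kernel launch.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Record = Tuple[str, float]
Kernel = Callable[[List[float]], List[float]]


def square_kernel(batch: List[float]) -> List[float]:
    """Hypothetical stand-in for a GPU kernel.

    It receives a whole batch of values at once and returns one result per
    value. A real setup would launch a device kernel here instead.
    """
    return [x * x for x in batch]


def batched_map(split: List[Record], kernel: Kernel) -> List[Record]:
    """Map phase for one input split.

    Gathers the values, runs the kernel once on the whole batch, and then
    re-attaches the original keys to the results.
    """
    keys = [key for key, _ in split]
    values = [value for _, value in split]
    results = kernel(values)  # one batched call instead of one call per record
    return list(zip(keys, results))


def shuffle(pairs: List[Record]) -> Dict[str, List[float]]:
    """Group intermediate values by key."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_sum(groups: Dict[str, List[float]]) -> Dict[str, float]:
    """Reduce phase: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def run_job(splits: List[List[Record]], kernel: Kernel) -> Dict[str, float]:
    """Run map, shuffle, and reduce sequentially over all splits."""
    mapped: List[Record] = []
    for split in splits:  # in a cluster, each split would go to its own map task
        mapped.extend(batched_map(split, kernel))
    return reduce_sum(shuffle(mapped))


if __name__ == "__main__":
    example_splits = [
        [("a", 1.0), ("b", 2.0)],
        [("a", 3.0), ("b", 4.0)],
    ]
    print(run_job(example_splits, square_kernel))
```

Batching per split is the design choice worth noting: a device kernel launch and the associated data transfer carry a fixed overhead, so a sketch like this calls the kernel once per split rather than once per record.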