Towards Deploying Elastic Hadoop in the Cloud

Hong Mao, Zhenzhong Zhang, Bin Zhao, Limin Xiao, Li Ruan
Published in: 2011 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery
Publication date: 2011-10-10
DOI: 10.1109/CyberC.2011.83 (https://doi.org/10.1109/CyberC.2011.83)
Citations: 15

Abstract

The rapid growth of Internet applications is driving the development of cloud computing, a new paradigm for provisioning computing infrastructure and services over a network. In cloud computing environments, MapReduce is often used for scientific computing, such as matrix multiplication, and for data mining and information extraction on massive data sets. Hadoop, an open-source implementation of MapReduce, is well suited to processing these kinds of applications in parallel. However, current Hadoop environments are mostly deployed manually on physical servers and lack flexibility. This paper proposes EHAD (Elastic Hadoop Auto-Deployer), a system that creates or destroys the required number of VM nodes and deploys or releases a Hadoop environment across those nodes for client users at the service level. We also propose multithreading and VMOP (Virtual Machine Optimized Placement) to improve EHAD's service quality. Experiments show that EHAD can deploy a Hadoop cluster on demand in less than 300 seconds. The multithreaded method shortens the time needed to create 28 VMs by a factor of 3, and the VMOP policy improves the runtime performance of the Hadoop cluster by 9.73 percent.
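The abstract does not describe how EHAD issues its VM-creation requests, so the following is only an illustrative sketch of the general idea behind multithreaded creation, not the authors' implementation. It assumes a hypothetical blocking `create_vm` call (simulated here with a short sleep) and compares issuing 28 requests one at a time with issuing them from a thread pool. The function names, the worker count, and the delay are all assumptions, and the timings it prints will not reproduce the paper's measurements.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def create_vm(name: str, delay: float = 0.01) -> str:
    """Hypothetical stand-in for a blocking VM-creation request."""
    time.sleep(delay)  # simulates waiting on the virtualization layer
    return name


def create_vms_sequential(names):
    """Create the VMs one after another."""
    return [create_vm(n) for n in names]


def create_vms_threaded(names, workers: int = 8):
    """Issue the creation requests from a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in the same order as the input names
        return list(pool.map(create_vm, names))


if __name__ == "__main__":
    # 28 nodes, matching the cluster size mentioned in the abstract
    names = [f"hadoop-node-{i:02d}" for i in range(28)]
    for fn in (create_vms_sequential, create_vms_threaded):
        start = time.perf_counter()
        created = fn(names)
        elapsed = time.perf_counter() - start
        print(f"{fn.__name__}: {len(created)} VMs in {elapsed:.2f}s")
```

Threads suit this kind of workload because each request spends most of its time waiting on an external system rather than computing, so the waits can overlap. How much overlap a real deployment achieves depends on how many concurrent requests the underlying hosts can absorb.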