Deploying and researching Hadoop in virtual machines

Guanghui Xu, Feng Xu, Hongxu Ma
Published in: 2012 IEEE International Conference on Automation and Logistics
Publication date: 2012-09-20
DOI: 10.1109/ICAL.2012.6308241 (https://doi.org/10.1109/ICAL.2012.6308241)
Citations: 52

Abstract

The emergence of Hadoop and the maturity of virtualization make it feasible to combine the two to process immense data sets. Research on Hadoop in a virtual environment requires an experimental environment. This paper first introduces the technologies used, such as CloudStack, MapReduce, and Hadoop, and then gives a method for deploying CloudStack. We next discuss how to deploy Hadoop in virtual machines obtained from CloudStack, and present an algorithm that solves the problem that all virtual machines created by CloudStack from the same template share the same hostname. We then run some Hadoop programs on the virtual cluster, which shows that deploying Hadoop in this way is feasible. Finally, some methods for optimizing Hadoop in virtual machines are discussed. Readers can follow this paper to set up their own Hadoop experimental environment and to grasp the current status and trends in optimizing Hadoop in virtual environments.
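
The abstract does not say how the duplicate-hostname problem is solved, so the following is only an illustrative sketch of one common approach, not the authors' algorithm. It assumes each virtual machine already has a distinct IPv4 address and derives a hostname from it. The function names `hostname_for` and `hosts_entries` and the `hadoop` prefix are hypothetical.

```python
import ipaddress
from typing import Iterable


def hostname_for(ip: str, prefix: str = "hadoop") -> str:
    """Derive a deterministic hostname from an IPv4 address.

    Assumption: every VM cloned from the template has a distinct IP,
    so the derived hostnames are distinct as well.
    """
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    # addr.packed is the 4-byte form; iterating it yields the octets as ints.
    return prefix + "-" + "-".join(str(octet) for octet in addr.packed)


def hosts_entries(ips: Iterable[str], prefix: str = "hadoop") -> str:
    """Build /etc/hosts-style lines mapping each IP to its derived hostname."""
    # Deduplicate, then sort numerically rather than as strings.
    unique_ips = sorted(set(ips), key=ipaddress.IPv4Address)
    return "\n".join(f"{ip}\t{hostname_for(ip, prefix)}" for ip in unique_ips)


if __name__ == "__main__":
    # Example addresses are made up for illustration.
    print(hosts_entries(["10.0.0.12", "10.0.0.11", "10.0.0.13"]))
```

In such a scheme each virtual machine would set its own hostname from its address at boot, and the same generated host list would be distributed to every node so that the Hadoop daemons can resolve one another by name.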