Towards Model Based Approach to Hadoop Deployment and Configuration
Yicheng Huang, X. Lan, Xing Chen, Wenzhong Guo
2015 12th Web Information System and Application Conference (WISA), published 2015-09-11
DOI: 10.1109/WISA.2015.65 (https://doi.org/10.1109/WISA.2015.65)
Citations: 14
Abstract
Hadoop is an open-source software framework for distributed processing of big data. The Hadoop ecosystem contains many kinds of services, such as HDFS, MapReduce, HBase, Hive, YARN, Flume, Spark, Storm, and ZooKeeper, which increase the complexity of deployment and configuration. Constructing a Hadoop cluster therefore takes considerable time. Although some management tools help administrators deploy and configure Hadoop clusters automatically, they usually provide a fixed solution, so administrators cannot use them to construct clusters that meet differing management requirements. Software architecture acts as a bridge between requirements and implementations, and it has been used to reduce the complexity and cost that result mainly from the difficulty of understanding large-scale, complex software systems. This paper proposes a model-based approach to Hadoop deployment and configuration that helps administrators construct Hadoop clusters in a simple yet sufficiently powerful manner. First, we provide unified models of the Hadoop software architecture, based on domain knowledge of current Hadoop deployment and configuration. Second, we provide a framework with a set of definable rules that domain experts use to describe their solutions for deploying and configuring Hadoop clusters. Administrators can thus use various custom solutions to automatically deploy and configure their Hadoop clusters according to different management requirements. In addition, a real-world experiment demonstrates the feasibility, effectiveness, and benefits of the new approach.
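To make the idea of "a model plus definable rules" concrete, the following is a minimal Python sketch and not the framework described in the paper. The `Node` class, the `Rule` type, both rule functions, and `plan_deployment` are hypothetical names introduced only for illustration; the abstract does not specify the model elements or the rule language. The sketch assumes a rule is simply a function that assigns service roles (here the HDFS NameNode and DataNode roles) to nodes in a cluster model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """A machine in the cluster model (hypothetical model element)."""
    name: str
    memory_gb: int
    roles: List[str] = field(default_factory=list)


# Assumption: a rule is a function that assigns roles to nodes in the model.
Rule = Callable[[List[Node]], None]


def master_on_largest_node(nodes: List[Node]) -> None:
    """Illustrative rule: place the NameNode on the node with the most memory."""
    max(nodes, key=lambda n: n.memory_gb).roles.append("NameNode")


def workers_everywhere_else(nodes: List[Node]) -> None:
    """Illustrative rule: every node without the NameNode role runs a DataNode."""
    for node in nodes:
        if "NameNode" not in node.roles:
            node.roles.append("DataNode")


def plan_deployment(nodes: List[Node], rules: List[Rule]) -> Dict[str, List[str]]:
    """Apply the rules in order and return a node-name -> roles plan."""
    for rule in rules:
        rule(nodes)
    return {node.name: list(node.roles) for node in nodes}


cluster = [Node("n1", 8), Node("n2", 32), Node("n3", 8)]
plan = plan_deployment(cluster, [master_on_largest_node, workers_everywhere_else])
print(plan)
# {'n1': ['DataNode'], 'n2': ['NameNode'], 'n3': ['DataNode']}
```

Swapping in a different list of rules yields a different plan from the same cluster model, which is the kind of customization the abstract attributes to expert-defined solutions.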