Performance Variations in Resource Scaling for MapReduce Applications on Private and Public Clouds

2014 IEEE 7th International Conference on Cloud Computing, Pub Date: 2014-06-27, DOI: 10.1109/CLOUD.2014.68

Fan Zhang, M. Sakr


Citations: 4

Abstract

In this paper, we delineate the causes of performance variations when scaling provisioned virtual resources for a variety of MapReduce applications. Hadoop MapReduce facilitates the development and execution of large-scale batch applications on big data. However, provisioning suitable resources to achieve the desired performance at an affordable cost requires expertise in the execution model of MapReduce, the resources available for provisioning, and the execution behavior of the application at hand. As an initial step towards automating this process, we characterize the difference in execution response for different MapReduce applications while varying the number of virtualized CPUs and memory resources, the number of map slots, and the cluster size on a private cloud. This characterization illustrates how differently Reduce-intensive and Map-intensive applications use provisioned resources at different scales (1-64 VMs): a 5x speedup compared to a 36x speedup. By comparing scalability efficiency, we indicate where resources are under-provisioned or over-provisioned for different MapReduce applications at large scale.
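The abstract does not define "scalability efficiency", so the following is only a minimal sketch under a common assumption: efficiency is the measured speedup divided by the scale-up factor, with 1.0 meaning ideal linear scaling. It also assumes that the 5x and 36x figures are speedups at 64 VMs relative to 1 VM, and that they belong to the Reduce-intensive and Map-intensive applications in that order. The function and variable names are hypothetical and do not come from the paper.

```python
def speedup(baseline_seconds: float, scaled_seconds: float) -> float:
    """Speedup of a scaled run relative to a baseline run."""
    return baseline_seconds / scaled_seconds


def scaling_efficiency(speedup_factor: float, scale_factor: float) -> float:
    """Fraction of ideal linear scaling achieved (assumed definition)."""
    return speedup_factor / scale_factor


# Assumption: both speedups are measured when going from 1 VM to 64 VMs.
SCALE_FACTOR = 64

# Speedups quoted in the abstract; the mapping to application type is
# inferred from word order and is an assumption.
reported_speedups = {
    "reduce_intensive": 5.0,
    "map_intensive": 36.0,
}

efficiency = {
    name: scaling_efficiency(value, SCALE_FACTOR)
    for name, value in reported_speedups.items()
}

for name, value in efficiency.items():
    print(f"{name}: {value:.3f} of ideal linear scaling")
```

Under these assumptions, a low efficiency suggests that added VMs are not being used effectively by that application (over-provisioning), which is the kind of comparison the abstract describes.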
