A simulation approach to evaluating design decisions in MapReduce setups

Guanying Wang, A. Butt, P. Pandey, Karan Gupta
DOI: 10.1109/MASCOT.2009.5366973 (https://doi.org/10.1109/MASCOT.2009.5366973)
Published in: 2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems
Publication date: 2009-12-28
Citations: 250

Abstract

MapReduce has emerged as a model of choice for supporting modern data-intensive applications. The model is easy to use and promises to reduce time-to-solution. It is also a key enabler for cloud computing, which provides transparent and flexible access to a large number of compute, storage and networking resources. Setting up and operating a large MapReduce cluster entails careful evaluation of various design choices and run-time parameters to achieve high efficiency. However, this design space has not been explored in detail. In this paper, we adopt a simulation approach to systematically understand the performance of MapReduce setups. The resulting simulator, MRPerf, captures such aspects of these setups as node, rack and network configurations, disk parameters and performance, data layout and application I/O characteristics, among others, and uses this information to predict expected application performance. Specifically, we use MRPerf to explore the effect of several component interconnect topologies, data locality, and software and hardware failures on overall application performance. MRPerf allows us to quantify the effect of these factors, and thus can serve as a tool for optimizing existing MapReduce setups as well as designing new ones.