Modeling the Performance of the Hadoop Online Prototype

Emanuel Vianna, Giovanni V. Comarela, Tatiana Pontes, J. Almeida, Virgílio A. F. Almeida, K. Wilkinson, Harumi A. Kuno, U. Dayal
{"title":"Modeling the Performance of the Hadoop Online Prototype","authors":"Emanuel Vianna, Giovanni V. Comarela, Tatiana Pontes, J. Almeida, Virgílio A. F. Almeida, K. Wilkinson, Harumi A. Kuno, U. Dayal","doi":"10.1109/SBAC-PAD.2011.24","DOIUrl":null,"url":null,"abstract":"MapReduce is an important paradigm to support modern data-intensive applications. In this paper we address the challenge of modeling performance of one implementation of MapReduce called Hadoop Online Prototype (HOP), with a specific target on the intra-job pipeline parallelism. We use a hierarchical model that combines a precedence model and a queuing network model to capture the intra-job synchronization constraints. We first show how to build a precedence graph that represents the dependencies among multiple tasks of the same job. We then apply it jointly with an approximate Mean Value Analysis (aMVA) solution to predict mean job response time and resource utilization. We validate our solution against a queuing network simulator in various scenarios, finding that our performance model presents a close agreement, with maximum relative difference under 15%.","PeriodicalId":390734,"journal":{"name":"2011 23rd International Symposium on Computer Architecture and High Performance Computing","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 23rd International Symposium on Computer Architecture and High Performance Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SBAC-PAD.2011.24","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 14

Abstract

MapReduce is an important paradigm to support modern data-intensive applications. In this paper we address the challenge of modeling performance of one implementation of MapReduce called Hadoop Online Prototype (HOP), with a specific target on the intra-job pipeline parallelism. We use a hierarchical model that combines a precedence model and a queuing network model to capture the intra-job synchronization constraints. We first show how to build a precedence graph that represents the dependencies among multiple tasks of the same job. We then apply it jointly with an approximate Mean Value Analysis (aMVA) solution to predict mean job response time and resource utilization. We validate our solution against a queuing network simulator in various scenarios, finding that our performance model presents a close agreement, with maximum relative difference under 15%.
Hadoop在线原型的性能建模
MapReduce是支持现代数据密集型应用的一个重要范例。在本文中,我们解决了MapReduce的一个实现,即Hadoop在线原型(HOP)的建模性能的挑战,并以任务内管道并行性为特定目标。我们使用结合了优先级模型和排队网络模型的分层模型来捕获作业内同步约束。我们首先展示如何构建一个优先级图,表示同一作业的多个任务之间的依赖关系。然后,我们将其与近似均值分析(aMVA)解决方案联合应用,以预测平均作业响应时间和资源利用率。我们针对排队网络模拟器在各种场景中验证了我们的解决方案,发现我们的性能模型非常一致,最大相对差异在15%以下。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信