LaSA: A locality-aware scheduling algorithm for Hadoop-MapReduce resource assignment

Tseng-Yi Chen, H. Wei, Ming-Feng Wei, Ying-Jie Chen, T. Hsu, W. Shih
Published in: 2013 International Conference on Collaboration Technologies and Systems (CTS)
Publication date: 2013-05-20
DOI: 10.1109/CTS.2013.6567252 (https://doi.org/10.1109/CTS.2013.6567252)
Citations: 32

Abstract

Cloud computing has grown in popularity over the past decade and has developed continuously alongside advances in architecture, software, and networking. Hadoop-MapReduce is a widely used software framework for processing parallelizable problems across large datasets on a distributed cluster of processors or stand-alone computers. In the cloud, Hadoop-MapReduce can scale incrementally in the number of processing nodes, and it is therefore designed to provide a processing platform with powerful computing capability. Network traffic is a major bottleneck in data-intensive computing, and network latency significantly degrades performance in data-parallel systems. The bottleneck arises from limited network bandwidth: network transfer is much slower than local disk access. Good data locality can therefore reduce network traffic and improve performance in data-intensive HPC systems. However, Hadoop's scheduler handles data locality poorly in resource assignment. In this paper, we present a locality-aware scheduling algorithm (LaSA) for the Hadoop-MapReduce scheduler. First, we propose a mathematical model of the weight of data interference in the Hadoop scheduler. Second, we present the LaSA algorithm, which uses the weight of data interference to provide data-locality-aware resource assignment in the Hadoop scheduler. Finally, we build an experimental environment with 3 clusters and 35 VMs to verify LaSA's performance.
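To make the idea of locality-aware resource assignment concrete, here is a minimal Python sketch of a greedy scheduler that prefers nodes already holding a task's input block. It is not the LaSA algorithm: the abstract does not define the weight of data interference, so the sketch substitutes a simple "replica-holding node first" rule. The function name `assign_tasks`, the data layout, and the tie-breaking by free slots are all assumptions made for illustration.

```python
from collections import Counter


def assign_tasks(task_replicas, free_slots):
    """Greedily assign map tasks to nodes, preferring data-local nodes.

    task_replicas: dict mapping task id -> list of nodes storing its input block
    free_slots: dict mapping node -> number of free map slots
    Returns a dict mapping task id -> (node, is_local).
    Hypothetical illustration only; not the scheduler described in the paper.
    """
    slots = Counter(free_slots)  # copy, so the caller's dict is not mutated
    assignment = {}
    for task, replicas in task_replicas.items():
        # Nodes that hold a replica of the input and still have a free slot.
        local = [n for n in replicas if slots[n] > 0]
        if local:
            node = max(local, key=lambda n: slots[n])
            is_local = True
        else:
            # Fall back to any node with a free slot; input must cross the network.
            candidates = [n for n, s in slots.items() if s > 0]
            if not candidates:
                break  # cluster is full; remaining tasks stay unassigned
            node = max(candidates, key=lambda n: slots[n])
            is_local = False
        slots[node] -= 1
        assignment[task] = (node, is_local)
    return assignment


if __name__ == "__main__":
    replicas = {"t1": ["A"], "t2": ["B"], "t3": ["A"]}
    result = assign_tasks(replicas, {"A": 1, "B": 1, "C": 1})
    for task, (node, is_local) in result.items():
        print(task, node, "local" if is_local else "remote")
```

In this example, `t3` wants node `A`, but `A`'s only slot has already gone to `t1`, so `t3` runs remotely on `C`. Contention of this kind for data-local slots is the sort of situation a locality-aware scheduler tries to reduce.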