Title: Skyport - Container-Based Execution Environment Management for Multi-cloud Scientific Workflows
Authors: Wolfgang Gerlach, Wei Tang, Kevin P. Keegan, Travis Harrison, Andreas Wilke, Jared Bischof, M. D'Souza, Scott Devoid, Daniel Murphy-Olson, N. Desai, Folker Meyer
Published in: 2014 5th International Workshop on Data-Intensive Computing in the Clouds, 2014-11-16
DOI: 10.1109/DataCloud.2014.6 (https://doi.org/10.1109/DataCloud.2014.6)
Abstract: Linux container technology has recently been gaining attention, as it promises to transform the way software is developed and deployed. Portability and ease of deployment make Linux containers an ideal technology for scientific workflow platforms. Skyport uses Docker containers to address the software deployment problems and resource utilization inefficiencies inherent in existing scientific workflow platforms. As an extension to AWE/Shock, our data analysis platform that provides scalable workflow execution environments for scientific data in the cloud, Skyport greatly reduces the complexity of providing the environment necessary to execute complex workflows.
Title: To Overlap or Not to Overlap: Optimizing Incremental MapReduce Computations for On-Demand Data Upload
Authors: Stefan Ene, Bogdan Nicolae, Alexandru Costan, Gabriel Antoniu
Published in: 2014 5th International Workshop on Data-Intensive Computing in the Clouds, 2014-11-16
DOI: 10.1109/DataCloud.2014.7 (https://doi.org/10.1109/DataCloud.2014.7)
Abstract: Research on cloud-based Big Data analytics has so far focused on optimizing the performance and cost-effectiveness of the computations, while largely neglecting an important aspect: users need to upload massive datasets to clouds before their computations can run. This paper studies the problem of running MapReduce applications while jointly optimizing the performance and cost of the data upload and of its corresponding computation. We analyze the feasibility of incremental MapReduce approaches that advance the computation as much as possible during the data upload by computing intermediate results from already transferred data. Our key finding is that overlapping the transfer time with as many incremental computations as possible is not always efficient: a better solution is to wait until enough data has accumulated to fill the computational capacity of the MapReduce cluster. Results show significant performance and cost reductions compared with state-of-the-art solutions that leverage incremental computations in a naive fashion.
{"title":"Locality and Network-Aware Reduce Task Scheduling for Data-Intensive Applications","authors":"Engin Arslan, Mrigank Shekhar, T. Kosar","doi":"10.1109/DataCloud.2014.10","DOIUrl":"https://doi.org/10.1109/DataCloud.2014.10","url":null,"abstract":"MapReduce is one of the leading programming frameworks to implement data-intensive applications by splitting the map and reduce tasks to distributed servers. Although there has been substantial amount of work on map task scheduling and optimization in the literature, the work on reduce task scheduling is very limited. Effective scheduling of the reduce tasks to the resources becomes especially important for the performance of data-intensive applications where large amounts of data are moved between the map and reduce tasks. In this paper, we propose a new algorithm (LoNARS) for reduce task scheduling, which takes both data locality and network traffic into consideration. Data locality awareness aims to schedule the reduce tasks closer to the map tasks to decrease the delay in data access as well as the amount of traffic pushed to the network. Network traffic awareness intends to distribute the traffic over the whole network and minimize the hotspots to reduce the effect of network congestion in data transfers. We have integrated LoNARS into Hadoop-1.2.1. Using our LoNARS algorithm, we achieved up to 15% gain in data shuffling time and up to 3-4% improvement in total job completion time compared to the other reduce task scheduling algorithms. Moreover, we reduced the amount of traffic on network switches by 15% which helps to save energy consumption considerably.","PeriodicalId":121831,"journal":{"name":"2014 5th International Workshop on Data-Intensive Computing in the Clouds","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115287693","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards Scalable Distributed Graph Database Engine for Hybrid Clouds","authors":"Miyuru Dayarathna, T. Suzumura","doi":"10.1109/DataCloud.2014.9","DOIUrl":"https://doi.org/10.1109/DataCloud.2014.9","url":null,"abstract":"Large graph data management and mining in clouds has become an important issue in recent times. We propose Acacia which is a distributed graph database engine for scalable handling of such large graph data. Acacia operates between the boundaries of private and public clouds. Acacia partitions and stores the graph data in the private cloud during its initial deployment. Acacia bursts into the public cloud when the resources of the private cloud are insufficient to maintain its service-level agreements. We implement Acacia using X10 programming language. We describe how Top-K PageRank has been implemented in Acacia. We report preliminary experiment results conducted with Acacia on a small compute cluster. Acacia is able to upload 69 million edges LiveJournal social network data set in about 10 minutes. Furthermore, Acacia calculates the average out degree of vertices of LiveJournal graph in 2 minutes. These results indicate Acacias potential for handling large graphs.","PeriodicalId":121831,"journal":{"name":"2014 5th International Workshop on Data-Intensive Computing in the Clouds","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117258048","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrating Pig with Harp to Support Iterative Applications with Fast Cache and Customized Communication","authors":"T. Wu, A. Koppula, J. Qiu","doi":"10.1109/DataCloud.2014.8","DOIUrl":"https://doi.org/10.1109/DataCloud.2014.8","url":null,"abstract":"Use of high-level scripting languages to solve big data problems has become a mainstream approach for sophisticated machine learning data analysis. Often data must be used in several steps of a computation to complete a full task. Composing default data transformation operators with the standard Hadoop MapReduce runtime is very convenient. However, the current strategy of using high-level languages to support iterative applications with Hadoop MapReduce relies on an external wrapper script in other languages such as Python and Groovy, which causes significant performance loss when restarting mappers and reducers between jobs. In this paper, we reduce the extra job startup overheads by integrating Apache Pig with the high-performance Hadoop plug-in Harp developed at Indiana University. This provides fast data caching and customized communication patterns among iterations for data analysis. The results show performance improvements of factors from 2 to 5.","PeriodicalId":121831,"journal":{"name":"2014 5th International Workshop on Data-Intensive Computing in the Clouds","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131118961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}