{"title":"Characterization and Comparison of Cloud versus Grid Workloads","authors":"S. Di, Derrick Kondo, W. Cirne","doi":"10.1109/CLUSTER.2012.35","DOIUrl":"https://doi.org/10.1109/CLUSTER.2012.35","url":null,"abstract":"A new era of Cloud Computing has emerged, but the characteristics of Cloud load in data centers is not perfectly clear. Yet this characterization is critical for the design of novel Cloud job and resource management systems. In this paper, we comprehensively characterize the job/task load and host load in a real-world production data center at Google Inc. We use a detailed trace of over 25 million tasks across over 12,500 hosts. We study the differences between a Google data center and other Grid/HPC systems, from the perspective of both work load (w.r.t. jobs and tasks) and host load (w.r.t. machines). In particular, we study the job length, job submission frequency, and the resource utilization of jobs in the different systems, and also investigate valuable statistics of machine's maximum load, queue state and relative usage levels, with different job priorities and resource attributes. We find that the Google data center exhibits finer resource allocation with respect to CPU and memory than that of Grid/HPC systems. Google jobs are always submitted with much higher frequency and they are much shorter than Grid jobs. As such, Google host load exhibits higher variance and noise.","PeriodicalId":143579,"journal":{"name":"2012 IEEE International Conference on Cluster Computing","volume":"25 4-5","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120980968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Job Scheduling Design for Visualization Services Using GPU Clusters","authors":"Wei-Hsien Hsu, Chun-Fu Wang, K. Ma, Hongfeng Yu, Jacqueline H. Chen","doi":"10.1109/CLUSTER.2012.63","DOIUrl":"https://doi.org/10.1109/CLUSTER.2012.63","url":null,"abstract":"Modern large-scale heterogeneous computers incorporating GPUs offer impressive processing capabilities. It is desirable to fully utilize such systems for serving multiple users concurrently to visualize large data at interactive rates. However, as the disparity between data transfer speed and compute speed continues to increase in heterogeneous systems, data locality becomes crucial for performance. We present a new job scheduling design to support multi-user exploration of large data in a heterogeneous computing environment, achieving near optimal data locality and minimizing I/O overhead. The targeted application is a parallel visualization system which allows multiple users to render large volumetric data sets in both interactive mode and batch mode. We present a cost model to assess the performance of parallel volume rendering and quantify the efficiency of job scheduling. We have tested our job scheduling scheme on two heterogeneous systems with different configurations. The largest test volume data used in our study has over two billion grid points. The timing results demonstrate that our design effectively improves data locality for complex multi-user job scheduling problems, leading to better overall performance of the service.","PeriodicalId":143579,"journal":{"name":"2012 IEEE International Conference on Cluster Computing","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116608581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Resource Utilization in MapReduce","authors":"Zhenhua Guo, G. Fox, Mo Zhou, Yang Ruan","doi":"10.1109/CLUSTER.2012.69","DOIUrl":"https://doi.org/10.1109/CLUSTER.2012.69","url":null,"abstract":"MapReduce has been adopted widely in both academia and industry to run large-scale data parallel applications. In MapReduce, each slave node hosts a number of task slots to which tasks can be assigned. So they limit the maximum number of tasks that can execute concurrently on each node. When all task slots of a node are not used, the resources “reserved” for idle slots are unutilized. To improve resource utilization, we propose resource stealing to enable running tasks to steal resources reserved for idle slots and give them back proportionally whenever new tasks are assigned. Resource stealing makes the otherwise wasted resources get fully utilized without interfering with normal job scheduling. MapReduce uses speculative execution to improve fault tolerance. Current Hadoop implementation decides whether to run speculative tasks based on the progress rates of running tasks, which does not take into consideration the absolute progress of each task. We propose Benefit Aware Speculative Execution which evaluates the potential benefit of speculative tasks and eliminates unnecessary runs. We implement the proposed algorithms in Hadoop, and our experiments show that our algorithms can significantly shorten job execution time and reduce the number of non-beneficial speculative tasks.","PeriodicalId":143579,"journal":{"name":"2012 IEEE International Conference on Cluster Computing","volume":"330 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122743179","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation and Optimization of Breadth-First Search on NUMA Cluster","authors":"Zehan Cui, Licheng Chen, Mingyu Chen, Yungang Bao, Yongbing Huang, Huiwei Lv","doi":"10.1109/CLUSTER.2012.29","DOIUrl":"https://doi.org/10.1109/CLUSTER.2012.29","url":null,"abstract":"Graph is widely used in many areas. Breadth-First Search (BFS), a key subroutine for many graph analysis algorithms, has become the primary benchmark for Graph500 ranking. Due to the high communication cost of BFS, multi-socket nodes with large memory capacity (NUMA) are supposed to reduce network pressure. However, the longer latency to remote memory may cause problem if not treated well. In this work, we first demonstrate that simply spawning and binding one MPI process for each socket can achieve the best performance for MPI/OpenMP hybrid programmed BFS algorithm, resulting in 1.53X of performance on 16 nodes. Nevertheless, we notice that one MPI process per socket may exacerbate the communication cost. We propose to share some communication data structure among the processes inside the same node, to eliminate most of the intra-node communication. To fully utilize the network bandwidth, we make all the processes in a node to perform communication simultaneously. We further adjust the granularity of a key bitmap for better cache locality to speed up the computation. With all the optimizations for NUMA, communication and computation together, 2.44X of performance is achieved on 16 nodes, which is 39.2 Billion Traversed Edges per Second for an R-MAT graph of scale 32 (4 billion vertices and 64 billion edges).","PeriodicalId":143579,"journal":{"name":"2012 IEEE International Conference on Cluster Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131281180","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adapting Irregular Computations to Large CPU-GPU Clusters in the MADNESS Framework","authors":"Vlad Slavici, R. Varier, G. Cooperman, R. Harrison","doi":"10.1109/CLUSTER.2012.42","DOIUrl":"https://doi.org/10.1109/CLUSTER.2012.42","url":null,"abstract":"Graphics Processing Units (GPUs) are becoming the workhorse of scalable computations. MADNESS is a scientific framework used especially for computational chemistry. Most MADNESS applications use operators that involve many small tensor computations, resulting in a less regular organization of computations on GPUs. A single GPU kernel may have to multiply by hundreds of small square matrices (with fixed dimension ranging from 10 to 28). We demonstrate a scalable CPU-GPU implementation of the MADNESS framework over a 500-node partition on the Titan supercomputer. For this hybrid CPU-GPU implementation, we observe up to a 2.3-times speedup compared to an equivalent CPU-only implementation with 16 cores per node. For smaller matrices, we demonstrate a speedup of 2.2-times by using a custom CUDA kernel rather than a cuBLAS-based kernel.","PeriodicalId":143579,"journal":{"name":"2012 IEEE International Conference on Cluster Computing","volume":"124 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123191088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ME2: Efficient Live Migration of Virtual Machine with Memory Exploration and Encoding","authors":"Yanqing Ma, Hongbo Wang, Jiankang Dong, Yangyang Li, Shiduan Cheng","doi":"10.1109/CLUSTER.2012.52","DOIUrl":"https://doi.org/10.1109/CLUSTER.2012.52","url":null,"abstract":"Live migration of virtual machine plays an important role in data center, which can successfully migrate virtual machine from one physical machine to another with only slight influence on upper workload. It can be used to facilitate hardware maintenance, load balancing, fault-tolerance and power-saving, especially in cloud computing data centers. Although the pre-copy is the prevailing approach, it cannot distinguish which memory page is used, resulting in transferring large amounts of useless memory pages. This paper presents a novel approach Memory Exploration and Encoding (ME2), which first identifies useful pages and then utilizes Run Length Encode algorithm to quickly encode memory, to efficiently decrease the total transferred data, total migration time and downtime. Experiments demonstrate that ME2 can significantly decrease 50.5% of total transferred data, 48.2% of total time and 47.6% of downtime on average compared with Xen's pre-copy algorithm.","PeriodicalId":143579,"journal":{"name":"2012 IEEE International Conference on Cluster Computing","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124445385","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"sEBP: Event Based Polling for Efficient I/O Virtualization","authors":"Kun Tian, Yaozu Dong, Xiang Mi, Haibing Guan","doi":"10.1109/CLUSTER.2012.50","DOIUrl":"https://doi.org/10.1109/CLUSTER.2012.50","url":null,"abstract":"Interrupt virtualization remains a key overhead source in high performance network virtualization (Single-root I/O virtualization or SR-IOV). SR-IOV can give close to line rate network bandwidth and good scalability in the 10 Gbps network environment, however the overhead of the interrupt virtualization in SR-IOV remains non-trivial, due to additional trap-and emulation overhead on the virtual interrupt controller, and high interrupt frequency brought by the high bandwidth network. In this paper we propose sEBP, an event-based polling model to eliminate the interrupts from the critical I/O paths in the virtual environment. A variety of system events are collected by sEBP, either at the guest kernel level or at the VMM level. Upon those events the NIC status is polled. The polling is lightweight, and plenty of system events fulfill the role of the interrupts. By removing the overhead of the interrupts, sEBP manages to achieve up to 59% performance improvement and 23% better scalability ratio.","PeriodicalId":143579,"journal":{"name":"2012 IEEE International Conference on Cluster Computing","volume":"14 S3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113957665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated Load Balancing Invocation Based on Application Characteristics","authors":"Harshitha Menon, Nikhil Jain, G. Zheng, L. Kalé","doi":"10.1109/CLUSTER.2012.61","DOIUrl":"https://doi.org/10.1109/CLUSTER.2012.61","url":null,"abstract":"Performance of applications executed on large parallel systems suffer due to load imbalance. Load balancing is required to scale such applications to large systems. However, performing load balancing incurs a cost which may not be known a priori. In addition, application characteristics may change due to its dynamic nature and the parallel system used for execution. As a result, deciding when to balance the load to obtain the best performance is challenging. Existing approaches put this burden on the users, who rely on educated guess and extrapolation techniques to decide on a reasonable load balancing period, which may not be feasible and efficient. In this paper, we propose the Meta-Balancer framework which relieves the application programmers of deciding when to balance load. By continuously monitoring the application characteristics and using a set of guiding principles, Meta-Balancer invokes load balancing on its own without any prior application knowledge. We demonstrate that Meta-Balancer improves or matches the best performance that can be obtained by fine tuning periodic load balancing. We also show that in some cases Meta-Balancer improves performance by 18% whereas periodic load balancing gives only a 1.5% benefit.","PeriodicalId":143579,"journal":{"name":"2012 IEEE International Conference on Cluster Computing","volume":"49 15","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113974267","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HerpRap: A Hybrid Array Architecture Providing Any Point-in-Time Data Tracking for Datacenter","authors":"Lingfang Zeng, D. Feng, Bo Mao, Jianxi Chen, Q. Wei, Wenguo Liu","doi":"10.1109/CLUSTER.2012.19","DOIUrl":"https://doi.org/10.1109/CLUSTER.2012.19","url":null,"abstract":"Both physical disk failure and logical errors such as software error, user abuse and virus attacks may cause data lose. The risk of logical errors is far greater than physical disk failure. Moreover, existing RAID solution cannot satisfy the reliability requirement in face of the logical errors in data centers. It is therefore becoming increasingly important for RAID-based storage systems to be able to recover data to any point-in-time when logical errors occur. We proposed a novel storage array architecture, Herp Rap, which is able to recover data from both physical disk failure and logical errors. We have implemented a prototype of Herp Rap and carried out extensive performance measurements using DBT-2 and file system benchmarks. Our experiments demonstrated that the proposed Herp Rap is able to track or recover data to any point-in-time quickly by tracing back the history of block logs. Moreover, Herp Rap outperforms existing HDD-based or SSD-based RAID5 with copy-on-write (COW) snapshot in terms of performance, energy efficiency, failure recovery ability and reliability.","PeriodicalId":143579,"journal":{"name":"2012 IEEE International Conference on Cluster Computing","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129537307","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"eco-IDC: Trade Delay for Energy Cost with Service Delay Guarantee for Internet Data Centers","authors":"Jianying Luo, Lei Rao, Xue Liu","doi":"10.1109/CLUSTER.2012.23","DOIUrl":"https://doi.org/10.1109/CLUSTER.2012.23","url":null,"abstract":"Cloud computing services are becoming integral part of people's daily life. These services are supported by Internet data centers (IDCs). As demand for cloud computing services soars, energy consumed by IDCs is skyrocketing. This paper studies an energy management problem - how to minimize energy cost for IDCs in deregulated electricity markets. While several existing works handle this problem by leveraging spatial diversity of electricity price, little has been done to address the temporal uncertainty in electricity price and arriving workload. This paper proposes a novel two-stage design and the eco-IDC (Energy Cost Optimization-IDC) algorithm to exploit temporal diversity of electricity price and dynamically schedule workload to execute on IDC servers through an input queue. Extensive evaluation experiments are performed to demonstrate that the proposed approach significantly reduces energy cost for IDCs, and guarantees a service delay bound for user requests.","PeriodicalId":143579,"journal":{"name":"2012 IEEE International Conference on Cluster Computing","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124975711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}