{"title":"Adjustable Credit Scheduling for High Performance Network Virtualization","authors":"Zhibo Chang, Jian Li, Ruhui Ma, Zhi-Jian Huang, Haibing Guan","doi":"10.1109/CLUSTER.2012.27","DOIUrl":"https://doi.org/10.1109/CLUSTER.2012.27","url":null,"abstract":"Virtualization technology is now widely adopted in cloud computing to support heterogeneous and dynamic workload. The scheduler in a virtual machine monitor (VMM) plays an important role in allocating resources. However, the type of applications in virtual machines (VM) is unknown to the scheduler, and I/O-intensive and CPU-intensive applications are treated the same. This makes virtual systems unable to take full advantage of high performance networks such as 10-Gigabit Ethernet. In this paper, we review the SR-IOV networking solution and show by experiment that the current credit scheduler in Xen does not utilize high performance networks efficiently. For this reason, we propose a novel scheduling model with two optimizations to eliminate the bottleneck caused by scheduler. In this model, guest domains are divided into I/O-intensive domains and CPU-intensive domains according to their monitored behaviour. I/O-intensive domains can obtain extra credits that CPU-intensive domains are willing to share. Besides, the total available credits is adjusted agilely to accelerate the I/O responsiveness. Our experimental evaluation with benchmarks shows that the new scheduling model improves bandwidth even when the system's load is very high.","PeriodicalId":143579,"journal":{"name":"2012 IEEE International Conference on Cluster Computing","volume":"78 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132432885","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HAaaS: Towards Highly Available Distributed Systems","authors":"Yaoguang Wang, Weiming Lu, Bin-bin Yu, Baogang Wei","doi":"10.1109/CLUSTER.2012.59","DOIUrl":"https://doi.org/10.1109/CLUSTER.2012.59","url":null,"abstract":"High availability is a valuable property in distributed systems. The master-slave model is used wildly in data management systems for high performance. However, many master-slave systems still have SPOF (Single Point of Failure) for the single master node. We exploit a generalized solution to meet several common use cases for different master-slave systems. The solution makes the high availability as a service (HAaaS), which uses a shared storage infrastructure to make the master stateless and provides an automatic fail over of high-availability service. We deploy the HAaaS in many master-slave subsystems in our unstructured data management system (UDMS) to make the UDMS highly available. The experiments demonstrate the feasibility and efficiency of our solution.","PeriodicalId":143579,"journal":{"name":"2012 IEEE International Conference on Cluster Computing","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133761598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A New End-to-End Flow-Control Mechanism for High Performance Computing Clusters","authors":"Javier Prades, F. Silla, J. Duato, H. Fröning, M. Nüssle","doi":"10.1109/CLUSTER.2012.15","DOIUrl":"https://doi.org/10.1109/CLUSTER.2012.15","url":null,"abstract":"High Performance Computing usually leverages messaging libraries such as MPI or GASNet in order to exchange data among processes in large-scale clusters. Furthermore, these libraries make use of specialized low-level networking layers in order to retrieve as much performance as possible from hardware interconnects such as Infini Band or Myrinet, for example. EXTOLL is another emerging technology targeted for high performance clusters. These specialized low-level networking layers require some kind of flow control in order to prevent buffer overflows at the received side. In this paper we present a new flow control mechanism that is able to adapt the buffering resources used by a process according to the parallel application communication pattern and the varying activity among communicating peers. The tests carried out in a 64-node 1024-core EXTOLL cluster show that our new dynamic flow-control mechanism provides extraordinarily high buffer efficiency along with very low overhead, which is reduced between 8 and 10 times.","PeriodicalId":143579,"journal":{"name":"2012 IEEE International Conference on Cluster Computing","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133856276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating Power-Monitoring Capabilities on IBM Blue Gene/P and Blue Gene/Q","authors":"Kazutomo Yoshii, K. Iskra, Rinku Gupta, P. Beckman, V. Vishwanath, Chenjie Yu, S. Coghlan","doi":"10.1109/CLUSTER.2012.62","DOIUrl":"https://doi.org/10.1109/CLUSTER.2012.62","url":null,"abstract":"Power consumption is becoming a critical factor as we continue our quest toward exascale computing. Yet, actual power utilization of a complete system is an insufficiently studied research area. Estimating the power consumption of a large scale system is a nontrivial task because a large number of components are involved and because power requirements are affected by the (unpredictable) workloads. Clearly needed is a power-monitoring infrastructure that can provide timely and accurate feedback to system developers and application writers so that they can optimize the use of this precious resource. Many existing large-scale installations do feature power-monitoring sensors, however, those are part of environmental- and health monitoring sub systems and were not designed with application level power consumption measurements in mind. In this paper, we evaluate the existing power monitoring of IBM Blue Gene systems, with the goal of understanding what capabilities are available and how they fare with respect to spatial and temporal resolution, accuracy, latency, and other characteristics. We find that with a careful choice of dedicated micro benchmarks, we can obtain meaningful power consumption data even on Blue Gene/P, where the interval between available data points is measured in minutes. We next evaluate the monitoring subsystem on Blue Gene/Q, and are able to study the power characteristics of FPU and memory subsystems of Blue Gene/Q. We find the monitoring subsystem capable of providing second-scale resolution of power data conveniently separated between node components with seven seconds latency. This represents a significant improvement in power monitoring infrastructure, and hope future systems will enable real-time power measurement in order to better understand application behavior at a finer granularity.","PeriodicalId":143579,"journal":{"name":"2012 IEEE International Conference on Cluster Computing","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114348977","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Minimizing Network Contention in InfiniBand Clusters with a QoS-Aware Data-Staging Framework","authors":"R. Rajachandrasekar, Jai Jaswani, H. Subramoni, D. Panda","doi":"10.1109/CLUSTER.2012.90","DOIUrl":"https://doi.org/10.1109/CLUSTER.2012.90","url":null,"abstract":"The rapid growth of supercomputing systems, both in scale and complexity, has been accompanied by degradation in system efficiencies. The sheer abundance of resources including millions of cores, vast amounts of physical memory and high-bandwidth networks are heavily under-utilized. This happens when the resources are time-shared amongst parallel applications that are scheduled to run on a subset of compute nodes in an exclusive manner. Several space-sharing techniques that have been proposed in the literature allow parallel applications to be co-located on compute nodes and share resources with each other. Although this leads to better system efficiencies, it also causes contention for system resources. In this work, we specifically address the problem of network contention, caused due to the sharing of network resources by parallel applications and file systems simultaneously. We leverage the Quality-of-Service (QoS) capabilities of the widely used Infini Band interconnect to enhance our data-staging file system, making it QoS-aware. This is a user-level framework that is agnostic of the file system and MPI implementation. Using this file system, we demonstrate the isolation of file system traffic from MPI communication traffic, thereby reducing the network contention. Experimental results show that MPI point-to-point latency can be reduced by up to 320 microseconds, and the bandwidth improved by up to 674MB/s in the presence of contention with I/O traffic. Furthermore, we were able to reduce the runtime of the AWP-ODC MPI application by about 9.89% in the presence of network contention, and also reduce the time spent in communication by the NAS CG kernel by 23.46%.","PeriodicalId":143579,"journal":{"name":"2012 IEEE International Conference on Cluster Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116282742","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Transactional Multi-row Access Guarantee in the Key-Value Store","authors":"Yaoguang Wang, Weiming Lu, Baogang Wei","doi":"10.1109/CLUSTER.2012.57","DOIUrl":"https://doi.org/10.1109/CLUSTER.2012.57","url":null,"abstract":"The emergence of Cloud Computing and Big Data drives the development of novel data stores named NoSQL. A mass of data stores are developed and the most are key-value stores, where the stores are partitioned with keys and a key can identify a row uniquely. However, the requirement for efficiency and scalability makes them only provide the single-row atomic access. But in the Big Data era, more and more applications built on the key-value stores need transactional functionality across multiple rows. So, it is natural to implement a multi-row transaction management for key-value stores. In this paper, we implement a transaction processing system (TrasPS) which guarantees the transactional multi-row access from the application client to the key-value store in our unstructured data management system (UDMS). We also provide fault tolerance and recovery for the transactions. The implementation and experiments in our UDMS show that TrasPS can provide scalable multi-row access functionality at a very low overhead.","PeriodicalId":143579,"journal":{"name":"2012 IEEE International Conference on Cluster Computing","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123725240","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enabling Fast, Noncontiguous GPU Data Movement in Hybrid MPI+GPU Environments","authors":"John Jenkins, James Dinan, P. Balaji, N. Samatova, R. Thakur","doi":"10.1109/CLUSTER.2012.72","DOIUrl":"https://doi.org/10.1109/CLUSTER.2012.72","url":null,"abstract":"Lack of efficient and transparent interaction with GPU data in hybrid MPI+GPU environments challenges GPU acceleration of large-scale scientific computations. A particular challenge is the transfer of noncontiguous data to and from GPU memory. MPI implementations currently do not provide an efficient means of utilizing data types for noncontiguous communication of data in GPU memory. To address this gap, we present an MPI data type-processing system capable of efficiently processing arbitrary data types directly on the GPU. We present a means for converting conventional data type representations into a GPU-amenable format. Fine-grained, element-level parallelism is then utilized by a GPU kernel to perform in-device packing and unpacking of noncontiguous elements. We demonstrate a several-fold performance improvement for noncontiguous column vectors, 3D array slices, and 4D array sub volumes over CUDA-based alternatives. Compared with optimized, layout-specific implementations, our approach incurs low overhead, while enabling the packing of data types that do not have a direct CUDA equivalent. These improvements are demonstrated to translate to significant improvements in end-to-end, GPU-to-GPU communication time. In addition, we identify and evaluate communication patterns that may cause resource contention with packing operations, providing a baseline for adaptively selecting data-processing strategies.","PeriodicalId":143579,"journal":{"name":"2012 IEEE International Conference on Cluster Computing","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128525362","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mastiff: A MapReduce-based System for Time-Based Big Data Analytics","authors":"Sijie Guo, Jin Xiong, Weiping Wang, Rubao Lee","doi":"10.1109/CLUSTER.2012.10","DOIUrl":"https://doi.org/10.1109/CLUSTER.2012.10","url":null,"abstract":"Existing MapReduce-based warehousing systems are not specially optimized for time-based big data analysis applications. Such applications have two characteristics: 1) data are continuously generated and are required to be stored persistently for a long period of time, 2) applications usually process data in some time period so that typical queries use time-related predicates. Time-based big data analytics requires both high data loading speed and high query execution performance. However, existing systems including current MapReduce-based solutions do not solve this problem well because the two requirements are contradictory. We have implemented a MapReduce-based system, called Mastiff, which provides a solution to achieve both high data loading speed and high query performance. Mastiff exploits a systematic combination of a column group store structure and a lightweight helper structure. Furthermore, Mastiff uses an optimized table scan method and a column-based query execution engine to boost query performance. Based on extensive experiments results with diverse workloads, we will show that Mastiff can significantly outperform existing systems including Hive, HadoopDB, and GridSQL.","PeriodicalId":143579,"journal":{"name":"2012 IEEE International Conference on Cluster Computing","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129198827","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the Effects of CPU Caches on MPI Point-to-Point Communications","authors":"Simone Pellegrini, T. Hoefler, T. Fahringer","doi":"10.1109/CLUSTER.2012.22","DOIUrl":"https://doi.org/10.1109/CLUSTER.2012.22","url":null,"abstract":"Several researchers investigated the placing of communication calls in message-passing parallel codes. The current rule of thumb it to maximize communication/computation overlap with early binding. In this work, we demonstrate that this is not the only design constraint because CPU caches can have a significant impact on communications. We conduct an empirical study of the interaction between CPU caching and communications for several different communication scenarios. We use the gained insight to formulate a set of intuitive rules for communication call placement and show how our rules can be applied to practical codes. Our optimized codes show an improvement of up to 40% for a simple stencil code. Our work is a first step towards communication optimizations by moving communication calls. We expect that future communication-aware compilers will use our insights as a standard technique to move communication calls in order to optimize performance.","PeriodicalId":143579,"journal":{"name":"2012 IEEE International Conference on Cluster Computing","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121565823","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Autotuning Stencil-Based Computations on GPUs","authors":"A. Mametjanov, Daniel Lowell, Ching-Chen Ma, B. Norris","doi":"10.1109/CLUSTER.2012.46","DOIUrl":"https://doi.org/10.1109/CLUSTER.2012.46","url":null,"abstract":"Finite-difference, stencil-based discretization approaches are widely used in the solution of partial differential equations describing physical phenomena. Newton-Krylov iterative methods commonly used in stencil-based solutions generate matrices that exhibit diagonal sparsity patterns. To exploit these structures on modern GPUs, we extend the standard diagonal sparse matrix representation and define new matrix and vector data types in the PETSc parallel numerical toolkit. We create tunable CUDA implementations of the operations associated with these types after identifying a number of GPU-specific optimizations and tuning parameters for these operations. We discuss our implementation of GPU auto tuning capabilities in the Orio framework and present performance results for several kernels, comparing them with vendor-tuned library implementations.","PeriodicalId":143579,"journal":{"name":"2012 IEEE International Conference on Cluster Computing","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126368561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}