2011 IEEE International Conference on Cluster Computing最新文献_第2页

Investigating Scenario-Conscious Asynchronous Rendezvous over RDMA 在RDMA上研究场景感知异步会合

2011 IEEE International Conference on Cluster Computing Pub Date : 2011-09-26 DOI: 10.1109/CLUSTER.2011.65

Judicael A. Zounmevo, A. Afsahi

引用次数: 3

EDO: Improving Read Performance for Scientific Applications through Elastic Data Organization EDO:通过弹性数据组织提高科学应用的读取性能

2011 IEEE International Conference on Cluster Computing Pub Date : 2011-09-26 DOI: 10.1109/CLUSTER.2011.18

Yuan Tian, S. Klasky, H. Abbasi, J. Lofstead, R. Grout, N. Podhorszki, Qing Liu, Yandong Wang, Weikuan Yu

{"title":"EDO: Improving Read Performance for Scientific Applications through Elastic Data Organization","authors":"Yuan Tian, S. Klasky, H. Abbasi, J. Lofstead, R. Grout, N. Podhorszki, Qing Liu, Yandong Wang, Weikuan Yu","doi":"10.1109/CLUSTER.2011.18","DOIUrl":"https://doi.org/10.1109/CLUSTER.2011.18","url":null,"abstract":"Large scale scientific applications are often bottlenecked due to the writing of checkpoint-restart data. Much work has been focused on improving their write performance. With the mounting needs of scientific discovery from these datasets, it is also important to provide good read performance for many common access patterns, which requires effective data organization. To address this issue, we introduce Elastic Data Organization (EDO), which can transparently enable different data organization strategies for scientific applications. Through its flexible data ordering algorithms, EDO harmonizes different access patterns with the underlying file system. Two levels of data ordering are introduced in EDO. One works at the level of data groups (a.k.a process groups). It uses Hilbert Space Filling Curves (SFC) to balance the distribution of data groups across storage targets. Another governs the ordering of data elements within a data group. It divides a data group into sub chunks and strikes a good balance between the size of sub chunks and the number of seek operations. Our experimental results demonstrate that EDO is able to achieve balanced data distribution across all dimensions and improve the read performance of multidimensional datasets in scientific applications.","PeriodicalId":200830,"journal":{"name":"2011 IEEE International Conference on Cluster Computing","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116640214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 58

Automatic Hybrid OpenMP + MPI Program Generation for Dynamic Programming Problems 动态规划问题的自动混合OpenMP + MPI程序生成

2011 IEEE International Conference on Cluster Computing Pub Date : 2011-09-26 DOI: 10.1109/CLUSTER.2011.28

Dennis Vandenberg, Q. Stout

引用次数: 1

AA-Dedupe: An Application-Aware Source Deduplication Approach for Cloud Backup Services in the Personal Computing Environment AA-Dedupe:面向个人计算环境下云备份业务的基于应用感知的源重复数据删除方法

2011 IEEE International Conference on Cluster Computing Pub Date : 2011-09-26 DOI: 10.1109/CLUSTER.2011.20

Yinjin Fu, Hong Jiang, Nong Xiao, Lei Tian, F. Liu

{"title":"AA-Dedupe: An Application-Aware Source Deduplication Approach for Cloud Backup Services in the Personal Computing Environment","authors":"Yinjin Fu, Hong Jiang, Nong Xiao, Lei Tian, F. Liu","doi":"10.1109/CLUSTER.2011.20","DOIUrl":"https://doi.org/10.1109/CLUSTER.2011.20","url":null,"abstract":"The market for cloud backup services in the personal computing environment is growing due to large volumes of valuable personal and corporate data being stored on desktops, laptops and smart phones. Source deduplication has become a mainstay of cloud backup that saves network bandwidth and reduces storage space. However, there are two challenges facing deduplication for cloud backup service clients: (1) low deduplication efficiency due to a combination of the resource-intensive nature of deduplication and the limited system resources on the PC-based client site, and (2) low data transfer efficiency since post-deduplication data transfers from source to backup servers are typically very small but must often cross a WAN. In this paper, we present AA-Dedupe, an application-aware source deduplication scheme, to significantly reduce the computational overhead, increase the deduplication throughput and improve the data transfer efficiency. The AA-Dedupe approach is motivated by our key observations of the substantial differences among applications in data redundancy and deduplication characteristics, and thus is based on an application-aware index structure that effectively exploits this application awareness. Our experimental evaluations, based on an AA-Dedupe prototype implementation, show that our scheme can improve deduplication efficiency over the state-of-art source-deduplication methods by a factor of 2-7, resulting in shortened backup window, increased power-efficiency and reduced cost for cloud backup services.","PeriodicalId":200830,"journal":{"name":"2011 IEEE International Conference on Cluster Computing","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126978214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 96

Design of HPC Node with Heterogeneous Processors 异构处理器HPC节点的设计

2011 IEEE International Conference on Cluster Computing Pub Date : 2011-09-26 DOI: 10.1109/CLUSTER.2011.23

Zheng Cao, Hongwei Tang, Qiang Li, Bo-Zhang Li, Fei Chen, Kai Wang, Xuejun An, Ninghui Sun

引用次数: 3

MPI Alltoall Personalized Exchange on GPGPU Clusters: Design Alternatives and Benefit GPGPU集群上的MPI全个性化交换:设计方案和收益

2011 IEEE International Conference on Cluster Computing Pub Date : 2011-09-26 DOI: 10.1109/CLUSTER.2011.67

A. Singh, S. Potluri, Hao Wang, K. Kandalla, S. Sur, D. Panda

{"title":"MPI Alltoall Personalized Exchange on GPGPU Clusters: Design Alternatives and Benefit","authors":"A. Singh, S. Potluri, Hao Wang, K. Kandalla, S. Sur, D. Panda","doi":"10.1109/CLUSTER.2011.67","DOIUrl":"https://doi.org/10.1109/CLUSTER.2011.67","url":null,"abstract":"General Purpose Graphics Processing Units (GPGPUs) are rapidly becoming an integral part of high performance system architectures. The Tianhe-1A and Tsubame systems received significant attention for their architectures that leverage GPGPUs. Increasingly many scientific applications that were originally written for CPUs using MPI for parallelism are being ported to these hybrid CPU-GPU clusters. In the traditional sense, CPUs perform computation while the MPI library takes care of communication. When computation is performed on GPGPUs, the data has to be moved from device memory to main memory before it can be used in communication. Though GPGPUs provide huge compute potential, the data movement to and from GPGPUs is both a performance and productivity bottleneck. Recently, the MVAPICH2 MPI library has been modified to directly support point-to-point MPI communication from the GPU memory [1]. Using this support, programmers do not need to explicitly move data to main memory before using MPI. This feature also enables performance improvement due to tight integration of GPU data movement and MPI internal protocols. Typically, scientific applications spend a significant portion of their execution time in collective communication. Hence, optimizing performance of collectives has a significant impact on their performance. MPI_Alltoall is a heavily used collective that has O(N2) communication, for N processes. In this paper, we outline the major design alternatives for MPI_Alltoall collective communication operation on GPGPU clusters. We propose three design alternatives and provide a corresponding performance analysis. Using our dynamic staging techniques, the latency of MPI_Alltoall on GPU clusters can be improved by 44% over a user level implementation and 31% over a send-recv based implementation for 256 KByte messages on 8 processes.","PeriodicalId":200830,"journal":{"name":"2011 IEEE International Conference on Cluster Computing","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115163344","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 21

Large-Scale Simulator for Global Data Infrastructure Optimization 面向全局数据基础设施优化的大规模模拟器

2011 IEEE International Conference on Cluster Computing Pub Date : 2011-09-26 DOI: 10.1109/CLUSTER.2011.15

Sergio Herrero-Lopez, John R. Williams, Abel Sanchez

{"title":"Large-Scale Simulator for Global Data Infrastructure Optimization","authors":"Sergio Herrero-Lopez, John R. Williams, Abel Sanchez","doi":"10.1109/CLUSTER.2011.15","DOIUrl":"https://doi.org/10.1109/CLUSTER.2011.15","url":null,"abstract":"IT infrastructures in global corporations are appropriately compared with nervous systems, in which body parts (interconnected datacenters) exchange signals (request responses) in order to coordinate actions (data visualization and manipulation). A priori inoffensive perturbations in the operation of the system or the elements composing the infrastructure can lead to catastrophic consequences. Downtime disables the capability of clients reaching the latest versions of the data and/or propagating their individual contributions to other clients, potentially costing millions of dollars to the organization affected. The imperative need of guaranteeing the proper functioning of the system not only forces to pay particular attention to network outages, hot-objects or application defects, but also slows down the deployment of new capabilities, features and equipment upgrades. Under these circumstances, decision cycles for these modifications can be extremely conservative, and be prolonged for years, involving multiple authorities across departments of the organization. Frequently, the solutions adopted are years behind state-of-the art technologies or phased out compared to leading research on the IT infrastructure field. In this paper, the utilization of a large-scale data infrastructure simulator is proposed, in order to evaluate the impact of \" what if\" scenarios on the performance, availability and reliability of the system. The goal is to provide data center operators a tool that allows understanding and predicting the consequences of the deployment of new network topologies, hardware configurations or software applications in a global data infrastructure, without affecting the service. The simulator was constructed using a multi-layered approach, providing a granularity down to the individual server component and client action, and was validated against a downscaled version of the data infrastructure of a Fortune 500 company.","PeriodicalId":200830,"journal":{"name":"2011 IEEE International Conference on Cluster Computing","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133935657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 9

Achieving Scalable Parallelization for the Hessenberg Factorization 实现海森伯格分解的可伸缩并行化

2011 IEEE International Conference on Cluster Computing Pub Date : 2011-09-26 DOI: 10.1109/CLUSTER.2011.16

A. Castaldo, R. Clint Whaley

引用次数: 9

Scheduling Workflows in Opportunistic Environments 在机会环境中调度工作流

2011 IEEE International Conference on Cluster Computing Pub Date : 2011-09-26 DOI: 10.1109/CLUSTER.2011.72

María M. López, E. Heymann, M. A. Senar

{"title":"Scheduling Workflows in Opportunistic Environments","authors":"María M. López, E. Heymann, M. A. Senar","doi":"10.1109/CLUSTER.2011.72","DOIUrl":"https://doi.org/10.1109/CLUSTER.2011.72","url":null,"abstract":"Workflow applications exhibit both high computation times and data transfer rates. For this reason, the completion time of the workflow is high. To reduce completion time, the tasks of a workflow ought to run on different machines interconnected by a network. Correct assignment of tasks to machines within the runtime environment is an important aspect in the completion time or make span. The manager making the assignment is the scheduler. The main problem of a static scheduler is that it ignores the changes that occur in the execution environment during DAG execution. To solve this problem, we developed a new dynamic scheduler. This dynamic scheduler monitors the behavior of the tasks executed as well as the execution environment, and it reacts to the changes detected by adapting the scheduling of the rest of pending tasks. The objective is to reduce the overhead incurred by excessive self-adaptations, without affecting the make span. To reduce overhead, the algorithm self-adapts only when an improvement in make span is expected. The proposed policies have been simulated and then executed in a real environment. These executions have achieved a reduction of the overhead of greater than 20%.","PeriodicalId":200830,"journal":{"name":"2011 IEEE International Conference on Cluster Computing","volume":"132 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121426112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 1

Quartile and Outlier Detection on Heterogeneous Clusters Using Distributed Radix Sort 基于分布基数排序的异构聚类的四分位数和离群值检测

2011 IEEE International Conference on Cluster Computing Pub Date : 2011-09-26 DOI: 10.1109/CLUSTER.2011.53

Kyle Spafford, J. Meredith, J. Vetter

引用次数: 6