2005 IEEE International Conference on Cluster Computing最新文献

Harnessing Shared Wide-area Clusters for Dynamic High End Services 利用共享广域集群实现动态高端服务

2005 IEEE International Conference on Cluster Computing Pub Date : 2005-09-27 DOI: 10.1109/CLUSTR.2005.347073

Ramesh Viswanath, M. Ahamad, Karsten Schwan Georgia

{"title":"Harnessing Shared Wide-area Clusters for Dynamic High End Services","authors":"Ramesh Viswanath, M. Ahamad, Karsten Schwan Georgia","doi":"10.1109/CLUSTR.2005.347073","DOIUrl":"https://doi.org/10.1109/CLUSTR.2005.347073","url":null,"abstract":"Current trends in distributed computing have been moving towards the use of wide-area clusters that are managed by different entities. In this paper, we introduce middleware-level support to facilitate computational resource sharing with service guarantees using non-dedicated server systems in wide-area clusters. The aim is to ensure that sets of computational tasks submitted to such high end systems are completed reliably and in a timely fashion. Our approach develops methods that enhance basic job scheduling with information about the execution history and trust values for the computational nodes to which jobs are assigned. In essence, job scheduling is enriched with trust models constructed and maintained at runtime, and scheduling decisions are based on metrics that capture trust in remote server systems. An implementation of the approach is evaluated on Planetlab, with initial results demonstrating good success rates in completing jobs within their specific service level agreements, including under conditions of high system loads. Additional results are attained with a variant of the scheduling algorithm that uses redundancy to further improve the likelihood of meeting end user SLAs. A representative application considered in this paper is remote data visualization, where substantial computation must be applied to data before displaying it to end users. SLAs capture desired end-to-end delay, and distributed server or cluster systems are used to perform the required computations in a timely manner","PeriodicalId":255312,"journal":{"name":"2005 IEEE International Conference on Cluster Computing","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125931486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 3

Online Critical Path Profiling for Parallel Applications 并行应用的在线关键路径分析

2005 IEEE International Conference on Cluster Computing Pub Date : 2005-09-27 DOI: 10.1109/CLUSTR.2005.347048

Wenbin Zhu, P. Bridges, A. Maccabe

引用次数: 6

RNIC-PI: The last step in standardizing RDMA RNIC-PI: RDMA标准化的最后一步

2005 IEEE International Conference on Cluster Computing Pub Date : 2005-09-27 DOI: 10.1109/CLUSTR.2005.347030

Ramesh VelurEunni

引用次数: 0

Parallel Out-of-Core Matlab for Extreme Virtual Memory 极限虚拟内存并行外核Matlab

2005 IEEE International Conference on Cluster Computing Pub Date : 2005-09-27 DOI: 10.1109/CLUSTR.2005.347016

Hahn Kim, J. Kepner, C. Kahn

{"title":"Parallel Out-of-Core Matlab for Extreme Virtual Memory","authors":"Hahn Kim, J. Kepner, C. Kahn","doi":"10.1109/CLUSTR.2005.347016","DOIUrl":"https://doi.org/10.1109/CLUSTR.2005.347016","url":null,"abstract":"Summary form only given. Large data sets that cannot fit in memory can be addressed with out-of-core methods, which use memory as a \"window \" to view a section of the data stored on disk at a time. The parallel Matlab for eXtreme virtual memory (pMatlab XVM) library adds out-of-core extensions to the parallel Matlab (pMatlab) library. We have applied pMatlab XVM to the DARPA high productivity computing systems' HPCchallenge FFT benchmark. The benchmark was run using several different implementations: C+MPI, pMatlab, pMatlab hand coded for out-of-core and pMatlab XVM. These experiments found 1) the performance of the C+MPI and pMatlab versions were comparable; 2) the out-of-core versions deliver 80% of the performance of the in-core versions; 3) the out-of-core versions were able to perform a 1 terabyte (64 billion point) FFT and 4) the pMatlab XVM program was smaller, easier to implement and verify, and more efficient than its hand coded equivalent. We are transitioning this technology to several DoD signal processing applications and plan to apply pMatlab XVM to the full HPCchallenge benchmark suite. Using next generation hardware, problems sizes a factor of 100 to 1000 times larger should be feasible","PeriodicalId":255312,"journal":{"name":"2005 IEEE International Conference on Cluster Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130749406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 2

Adaptive Management of a Utility Computing 效用计算的自适应管理

2005 IEEE International Conference on Cluster Computing Pub Date : 2005-09-27 DOI: 10.1109/CLUSTR.2005.347012

Yingyin Jiang, Dan Meng, Yi Liang, Danjun Liu, Jianfeng Zhan

引用次数: 1

Search-based Job Scheduling for Parallel Computer Workloads 并行计算机工作负载的基于搜索的作业调度

2005 IEEE International Conference on Cluster Computing Pub Date : 2005-09-27 DOI: 10.1109/CLUSTR.2005.347037

S. Vasupongayya, S. Chiang, Barton C. Massey

引用次数: 14

The ELIHE High-Performance Cluster ELIHE高性能集群

2005 IEEE International Conference on Cluster Computing Pub Date : 2005-09-27 DOI: 10.1109/CLUSTR.2005.347091

Violeta Holmes, Terence McDonough

{"title":"The ELIHE High-Performance Cluster","authors":"Violeta Holmes, Terence McDonough","doi":"10.1109/CLUSTR.2005.347091","DOIUrl":"https://doi.org/10.1109/CLUSTR.2005.347091","url":null,"abstract":"Summary form only given. In this poster, we present our experience in implementing a high performance computing cluster for teaching parallel computing theory and development of parallel applications. In teaching parallel and high performance computing, there is often a gap between potential performance taught in the lectures and those practically experienced in exercises in the laboratory. The development of the ELIHE cluster provides us with an opportunity take a hands on approach in teaching programming environments, tools, and libraries for development of parallel applications, parallel computation, architectures, message passing and shared memory paradigms using MPI and OpenMP, etc, at both undergraduate and graduate level. The ELIHE HP cluster consists of 9 computational nodes and a master node. All the nodes in the cluster are commodity systems - PCs, running commodity software - Linux, and CLIC Mandrake. Creating the ELIHE cluster has fulfilled two important goals: to design and implement a HP cluster for teaching parallel computing architectures in the School of Science and Technology, and to promote the use of high performance computer technology for research to faculty members and students","PeriodicalId":255312,"journal":{"name":"2005 IEEE International Conference on Cluster Computing","volume":"114 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117212508","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 5

Grid and Cluster Matrix Computation with Persistent Storage and Out-of-core Programming 网格和聚类矩阵计算与持久存储和核外编程

2005 IEEE International Conference on Cluster Computing Pub Date : 2005-09-27 DOI: 10.1109/CLUSTR.2005.347071

Lamine M. Aouad, S. Petiton, M. Sato

引用次数: 11

Registration and Resource Allocation Mechanisms in High-Performance Application Frameworks 高性能应用程序框架中的注册和资源分配机制

2005 IEEE International Conference on Cluster Computing Pub Date : 2005-09-27 DOI: 10.1109/CLUSTR.2005.347084

O. Volberg, J. Larson, R. Jacob, J. Michalakes

{"title":"Registration and Resource Allocation Mechanisms in High-Performance Application Frameworks","authors":"O. Volberg, J. Larson, R. Jacob, J. Michalakes","doi":"10.1109/CLUSTR.2005.347084","DOIUrl":"https://doi.org/10.1109/CLUSTR.2005.347084","url":null,"abstract":"Summary form only given. Commodity clusters have enabled ambitious multiphysics or coupled modeling of complex, mutually interacting, computationally intensive systems in science and engineering. Each individual sub-system is represented as a component with its own parallel processor layout and requirements for temporal advance. A central challenge in developing such systems is the parallel coupling problem, which involves overall system architecture and the automation of component registration, distribution of the processor pool between individual components, parallel data transfer and transformation. There currently exist efficient mechanisms for automating parallel data transfer and transformation such as MCT and MPCCI. Mechanisms for top-level system integration, including component registration and resource allocation, scheduling, and control at runtime are less mature and face even greater challenges in heterogeneous environments. We will discuss the numerous architectural choices faced in framework and parallel coupled application development, and will illustrate them through a comparison of these mechanisms in four scientific application frameworks: the community climate system model, the space weather modeling framework, the earth system modeling framework, and the weather research and forecasting model. We will then discuss a more sophisticated set of requirements for automating these functions in application frameworks for heterogeneous clusters and computational grids","PeriodicalId":255312,"journal":{"name":"2005 IEEE International Conference on Cluster Computing","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132621247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

A Cost/Benefit Estimating Service for Mapping Parallel Applications on Heterogeneous Clusters 异构集群上并行应用映射的成本/收益评估服务

2005 IEEE International Conference on Cluster Computing Pub Date : 2005-09-27 DOI: 10.1109/CLUSTR.2005.347062

D. Katramatos, S. Chapin

{"title":"A Cost/Benefit Estimating Service for Mapping Parallel Applications on Heterogeneous Clusters","authors":"D. Katramatos, S. Chapin","doi":"10.1109/CLUSTR.2005.347062","DOIUrl":"https://doi.org/10.1109/CLUSTR.2005.347062","url":null,"abstract":"Matching the resource requirements of a parallel application to the available resources of a large, heterogeneous cluster is a key requirement in effectively scheduling the application tasks on the nodes of the cluster. This paper describes the cost/benefit estimating service (CBES), a runtime scheduling system targeted at finding highly effective schedules (or mappings) of tasks on nodes. CBES relies on its own infrastructure to gather and maintain static and dynamic information profiles for the computing system and the applications of interest. At the core of CBES is a mapping evaluation module which evaluates candidate application mappings on the basis of shortest execution times. By default, CBES uses a simulated-annealing based scheduler to select mappings. The paper presents the design, initial implementation, and test results of CBES on the Centurion cluster at the University of Virginia and the Orange Grove cluster at Syracuse University. These tests demonstrated that the exploitation of internode communication speed differences due to network heterogeneity can yield speedups of over 10% between same architecture nodes. The maximum observed speedup across architectures for the best vs. worst mapping scenarios of the same application was over 36%, while the corresponding average case speedup was approximately 30%","PeriodicalId":255312,"journal":{"name":"2005 IEEE International Conference on Cluster Computing","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117187701","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 7