2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935): latest papers, page 4

Cluster computing environment supporting single system image

2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935) Pub Date : 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392621

Min Choi, DaeWoo Lee, S. Maeng

Abstract: Single system image (SSI) systems have been the mainstay of high-performance computing for many years. SSI requires the integration and aggregation of all types of resources in a cluster to present a single interface to users. We describe a cluster computing environment supporting SSI, constructed from three components: single process space (SPS), process migration, and dynamic load balancing. These components attempt to share all available resources in the cluster among all executing processes, so that the cluster operates like a single node with much more computing power. The most important goal is to combine these constructs in innovative ways to build a cluster computing environment for SSI, as well as to improve the performance or functionality of each construct individually. Our implementation of process migration is capable of resolving broken pipe problems and bind errors on server socket reconstruction. We realize SPS based on block PID allocation. We also designed and implemented a dynamic load balancing scheme that resolves the limitations of our previous work by continuously tracing job resource usage at runtime. The experimental results show that these three constructs for SSI clusters achieve scalability, functionality, and performance improvements. The cluster computing environment allows these constructs to cooperate implicitly, so that they create a synergy effect at the SSI cluster system level and successfully provide a single system image to users and administrators.
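The abstract above says only that the single process space is realized "based on block PID allocation" and gives no further detail. As a rough illustration of the general idea, the Python sketch below assumes each node owns one contiguous block of PIDs, so a PID is unique across the cluster without any coordination and the creating node can be recovered from the PID alone. The block size, the class `NodePidAllocator`, and the helper `home_node` are hypothetical names chosen for this sketch, not taken from the paper.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 10_000  # hypothetical block size, not a value from the paper


@dataclass
class NodePidAllocator:
    """Hands out PIDs from the contiguous block assumed to belong to one node."""

    node_id: int
    block_size: int = BLOCK_SIZE
    _next_offset: int = field(default=0, init=False)

    def allocate(self) -> int:
        # Refuse to spill into the block owned by the next node.
        if self._next_offset >= self.block_size:
            raise RuntimeError(f"PID block exhausted on node {self.node_id}")
        pid = self.node_id * self.block_size + self._next_offset
        self._next_offset += 1
        return pid


def home_node(pid: int, block_size: int = BLOCK_SIZE) -> int:
    """Recover the node that created a PID from the PID alone."""
    return pid // block_size
```

With two allocators, node 0 hands out 0, 1, and so on, while node 2 starts at 20000, and `home_node(20000)` returns 2. A real system would also have to handle PID reuse and processes that migrate away from their creating node, which this sketch leaves out.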

Citations: 2

A novel adaptive home migration protocol in home-based DSM

2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935) Pub Date : 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392619

W. Fang, Cho-Li Wang, Wenzhang Zhu, F. Lau

Citations: 9

Dynamic page migration in software DSM systems

2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935) Pub Date : 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392659

T. Repantis, C. Antonopoulos, V. Kalogeraki, T. Papatheodorou

Abstract: Dynamic page migration, when employed in distributed shared memory (DSM) systems, offers several advantages: it (i) reduces the latency of memory accesses, (ii) improves resource utilization by considering the computational and communication needs of the applications and adapting to changing resource availability, and (iii) achieves the above with lower overhead than traditional approaches that rely on thread migration. We propose a simple and efficient page migration mechanism that dynamically allocates shared memory pages to home nodes. Each page has a designated home node, and nodes that heavily modify a page can become its new home. In our protocol, to avoid redundant page transfers, we perform migration only when the number of modifications of a page becomes larger than a threshold. The migration information is piggybacked on the existing synchronization messages to minimize the communication overhead. The migration decision is taken locally, at the home of each page. We have implemented our mechanism in the JIAJIA software DSM. Performance evaluation using real application benchmarks shows that our mechanism significantly reduces remote page modifications, improves memory access latencies, and achieves better performance than its competitors. We observe that the cost of executing the algorithm and of migrating the pages is amortized by the benefits gained.
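The threshold rule described in the abstract above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the JIAJIA implementation: the class `PageHome`, the per-writer counters, and the counter reset after a migration are all hypothetical choices made for this sketch, and the piggybacking of migration information on synchronization messages is not modeled.

```python
from collections import Counter
from typing import Optional


class PageHome:
    """Migration state kept at the current home node of one shared page."""

    def __init__(self, page_id: int, home: int, threshold: int) -> None:
        self.page_id = page_id
        self.home = home
        self.threshold = threshold
        # Assumption: modifications are counted per remote writer.
        self.remote_writes: Counter = Counter()

    def record_modification(self, writer: int) -> Optional[int]:
        """Count one modification; return the new home if migration triggers."""
        if writer == self.home:
            return None  # writes by the home node never cause migration
        self.remote_writes[writer] += 1
        # Migrate only once the count becomes larger than the threshold.
        if self.remote_writes[writer] > self.threshold:
            self.home = writer
            self.remote_writes.clear()  # assumption: start fresh at the new home
            return writer
        return None
```

With `threshold=2`, the first two modifications by node 1 leave the home unchanged, and the third moves the page's home to node 1. The decision uses only state held at the page's home, which mirrors the abstract's statement that the migration decision is taken locally.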

Citations: 4

Attaining higher performance in collective communication

2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935) Pub Date : 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392650

E. Chan, M. Heimlich, A. Purkayastha, R. V. D. Geijn

Citations: 0

ClusterSim: a Java-based parallel discrete-event simulation tool for cluster computing

2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935) Pub Date : 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392639

L.F.W. Goes, Luiz E. Ramos, C. Martins

Citations: 20

Parallel I/O: lessons learnt in the last 20 years

2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935) Pub Date : 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392592

Toni Cortes

Citations: 0

Flexible and dynamic control of network QoS in grid environments: the QoSINUS approach

2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935) Pub Date : 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392657

P. Primet, J. Montagnat, F. Chanussot

Citations: 2

Towards provision of quality of service guarantees in job scheduling

2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935) Pub Date : 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392622

Mohammad Islam, P. Balaji, P. Sadayappan, D. Panda

Citations: 28

Bandwidth-aware co-allocating meta-schedulers for mini-grid architectures

2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935) Pub Date : 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392600

William M. Jones, Louis W. Pang, W. Ligon, D. Stanzione

Citations: 22

Improving the performance of communication-intensive parallel applications executing on clusters

2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935) Pub Date : 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392658

X. Qin, Hong Jiang

Abstract: Summary form only given. Clusters have emerged as a primary and cost-effective infrastructure for parallel applications, including communication-intensive applications that transfer a large amount of data among nodes of a cluster via the interconnection network. Conventional load balancers have proven effective in increasing the utilization of CPU, memory, and disk I/O resources in a cluster. However, most existing load balancing schemes ignore network resources, leaving open the opportunity for significant performance bottlenecks to form for communication-intensive parallel applications due to unevenly distributed communication load. To remedy this problem, we propose a communication-aware load balancing technique that is capable of improving the performance of communication-intensive applications by increasing the effective utilization of network resources in clusters. To facilitate the proposed load-balancing scheme, we introduce a behavior model for parallel applications with large requirements of CPU, memory, network, and disk I/O resources. The proposed load-balancing scheme can make full use of this model to quickly and accurately determine the load induced by a variety of parallel applications. Simulation results on executing a diverse set of both synthetic bulk synchronous and real parallel applications on a cluster show that the proposed scheme can significantly improve performance, in both slowdown and turn-around time, over three existing schemes by up to 206% (with an average of 74%) and 235% (with an average of 82%), respectively.
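The abstract above does not give the load formula, so the following Python sketch only illustrates the general idea of including the network in a placement decision. It assumes each node reports a utilization in [0, 1] for CPU, memory, network, and disk I/O, and it weights each resource by the job's share of demand for it. The `Load` type, the weighting rule, and the function names are hypothetical and are not the scheme proposed in the paper.

```python
from dataclasses import astuple, dataclass
from typing import Dict


@dataclass(frozen=True)
class Load:
    """Utilization (for a node) or demand (for a job) per resource."""

    cpu: float
    memory: float
    network: float
    disk_io: float


def score(node: Load, job: Load) -> float:
    """Weighted node load; weights follow the job's relative demand (assumption)."""
    demand = astuple(job)
    total = sum(demand)
    # Fall back to equal weights when the job declares no demand at all.
    weights = [d / total for d in demand] if total > 0 else [0.25] * 4
    return sum(w * u for w, u in zip(weights, astuple(node)))


def pick_node(nodes: Dict[str, Load], job: Load) -> str:
    """Choose the node with the lowest weighted load for this job."""
    return min(nodes, key=lambda name: score(nodes[name], job))


def pick_node_cpu_only(nodes: Dict[str, Load]) -> str:
    """Baseline that ignores the network, for comparison."""
    return min(nodes, key=lambda name: nodes[name].cpu)
```

For a communication-heavy job such as `Load(0.1, 0.1, 0.7, 0.1)` and two nodes `a = Load(0.2, 0.3, 0.9, 0.1)` and `b = Load(0.6, 0.4, 0.2, 0.1)`, the CPU-only baseline picks `a`, whose network is already nearly saturated, while `pick_node` picks `b`.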

Citations: 1