2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935): Latest Papers

Cluster computing environment supporting single system image
2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935) Pub Date: 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392621
Min Choi, DaeWoo Lee, S. Maeng
Abstract: Single system image (SSI) systems have been the mainstay of high-performance computing for many years. SSI requires the integration and aggregation of all types of resources in a cluster to present a single interface to users. We describe a cluster computing environment supporting SSI, constructed through three components: single process space (SPS), process migration, and dynamic load balancing. These components attempt to share all available resources in the cluster among all executing processes, so that the cluster operates like a single node with much more computing power. The most important goal is to combine these constructs in innovative ways to build a cluster computing environment for SSI, as well as to improve the performance or functionality of each construct individually. Our implementation of process migration is capable of resolving broken pipe problems and bind errors on server socket reconstruction. We realize SPS based on block PID allocation. We also designed and implemented a dynamic load balancing scheme which resolves the limitations of our previous work by continuously tracing the job resource usage at runtime. The experimental results show that these three constructs for SSI clusters deliver improvements in scalability, functionality, and performance. The cluster computing environment allows these constructs to cooperate implicitly, so that they create a synergistic effect at the SSI cluster system level and successfully provide a single system image to users and administrators.
Citations: 2
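The abstract above says only that the single process space is realized through block PID allocation and gives no further detail. One plausible reading, assumed here and not taken from the paper, is that each node owns a contiguous block of PIDs, so every PID is unique across the cluster and the owning node can be derived from the PID alone. The names `BLOCK_SIZE`, `BlockPidAllocator`, and `home_node` are hypothetical.

```python
BLOCK_SIZE = 10_000  # assumed number of PIDs reserved per node (hypothetical value)


class BlockPidAllocator:
    """Hands out PIDs from the contiguous block owned by one node."""

    def __init__(self, node_id: int, block_size: int = BLOCK_SIZE) -> None:
        self.node_id = node_id
        self.block_size = block_size
        self._next_offset = 0  # next unused slot inside this node's block

    def allocate(self) -> int:
        """Return a cluster-unique PID, or fail when the block is used up."""
        if self._next_offset >= self.block_size:
            raise RuntimeError(f"PID block exhausted on node {self.node_id}")
        pid = self.node_id * self.block_size + self._next_offset
        self._next_offset += 1
        return pid


def home_node(pid: int, block_size: int = BLOCK_SIZE) -> int:
    """Recover the node that originally allocated a PID."""
    return pid // block_size
```

Under this assumption, a process that migrates keeps its PID, and any node can find the originating node with one integer division instead of a cluster-wide lookup. Recycling of freed PIDs is left out of the sketch.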
A novel adaptive home migration protocol in home-based DSM
2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935) Pub Date: 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392619
W. Fang, Cho-Li Wang, Wenzhang Zhu, F. Lau
Abstract: Home migration is used to tackle the home assignment problem in home-based software distributed shared memory systems. We propose an adaptive home migration protocol to optimize the single-writer pattern, which occurs frequently in distributed applications. Our approach is unique in its use of a per-object threshold which is continuously adjusted to facilitate home migration decisions. This adaptive threshold decreases monotonically as the likelihood increases that a particular object exhibits a lasting single-writer pattern. The threshold is tuned according to the feedback of previous home migration decisions at runtime. We implement this adaptive home migration protocol in a distributed Java virtual machine that supports truly parallel execution of multithreaded Java applications on clusters. The analysis and the experiments show that our home migration protocol demonstrates both sensitivity to the lasting single-writer pattern and robustness against the transient single-writer pattern. In the latter case, the protocol inhibits home migration in order to reduce the home redirection overhead.
Citations: 9
Dynamic page migration in software DSM systems
2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935) Pub Date: 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392659
T. Repantis, C. Antonopoulos, V. Kalogeraki, T. Papatheodorou
Abstract: Dynamic page migration, when employed in distributed shared memory (DSM) systems, offers several advantages: (i) it reduces the latency of memory accesses, (ii) it improves resource utilization by considering the computational and communication needs of the applications and adapting to changing resource availability, and (iii) it achieves the above with lower overhead than traditional approaches that rely on thread migration. We propose a simple and efficient page migration mechanism that dynamically allocates shared memory pages to home nodes. Each page has a designated home node, and nodes that heavily modify the pages can become their new homes. In our protocol, to avoid redundant page transfers, we perform migration only when the number of modifications of a page becomes larger than a threshold. The migration information is piggybacked on the existing synchronization messages to minimize the communication overhead. The migration decision is taken locally, at the home of each page. We have implemented our mechanism in the JIAJIA software DSM. Performance evaluation using real application benchmarks shows that our mechanism significantly reduces remote page modifications, improves memory access latencies, and achieves better performance than its competitors. We observe that the cost of executing the algorithm and of migrating the pages is amortized by the benefits gained.
Citations: 4
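The threshold rule described in this abstract can be illustrated with a small sketch. It is not the JIAJIA implementation: the class `PageHome`, its method names, and the choice to reset counters only after a migration are assumptions made for illustration. What it keeps from the abstract is that the current home decides locally, and that a page moves only once some node's modification count exceeds a threshold.

```python
from collections import defaultdict


class PageHome:
    """Home-side bookkeeping for one shared page (illustrative only)."""

    def __init__(self, home: int, threshold: int) -> None:
        self.home = home
        self.threshold = threshold
        self.remote_writes = defaultdict(int)  # node id -> modifications seen so far

    def record_modification(self, node: int) -> None:
        """Count a modification made by a node other than the current home."""
        if node != self.home:
            self.remote_writes[node] += 1

    def decide_at_sync(self) -> int:
        """Run by the current home at a synchronization point.

        Migrates the page to the heaviest remote writer only if that node's
        count is strictly larger than the threshold; returns the home node.
        """
        if self.remote_writes:
            writer, count = max(self.remote_writes.items(), key=lambda kv: kv[1])
            if count > self.threshold:
                self.home = writer
                self.remote_writes.clear()  # assumed: counting restarts under the new home
        return self.home
```

In a real protocol the new home would be announced on the synchronization messages that are already being exchanged, as the abstract describes; the sketch only models the decision itself.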
Attaining higher performance in collective communication
2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935) Pub Date: 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392650
E. Chan, M. Heimlich, A. Purkayastha, R. V. D. Geijn
Abstract: Summary form only given. It has long been thought that research into collective communication algorithms on distributed-memory parallel computers has been exhausted. This project demonstrates that the implementations available as part of widely-used libraries are suboptimal. We demonstrate this through the implementation of the "reduce-scatter" collective communication and comparison with the MPICH implementation of MPI. Performance on a large cluster is reported.
Citations: 0
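For readers unfamiliar with the collective named in this abstract, the sketch below shows what a reduce-scatter computes, using plain Python lists in a single process. It models only the result, assuming a sum reduction and equal-sized blocks; it is neither the authors' algorithm nor MPICH's, and it says nothing about communication cost. The function name `reduce_scatter` and its list-of-lists interface are illustrative assumptions.

```python
from typing import List


def reduce_scatter(send_buffers: List[List[float]]) -> List[List[float]]:
    """Simulate a sum reduce-scatter over p ranks.

    send_buffers[r] is the full-length vector contributed by rank r.
    The result's entry r is the block of the element-wise sum that
    rank r would hold after the collective completes.
    """
    p = len(send_buffers)
    n = len(send_buffers[0])
    if any(len(buf) != n for buf in send_buffers):
        raise ValueError("all ranks must contribute vectors of equal length")
    if n % p != 0:
        raise ValueError("vector length must be divisible by the number of ranks")
    # Reduce: element-wise sum across all ranks.
    reduced = [sum(column) for column in zip(*send_buffers)]
    # Scatter: rank r keeps the r-th contiguous block of the reduced vector.
    chunk = n // p
    return [reduced[r * chunk:(r + 1) * chunk] for r in range(p)]
```

With two ranks contributing `[1, 2, 3, 4]` and `[10, 20, 30, 40]`, rank 0 ends up with `[11, 22]` and rank 1 with `[33, 44]`.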
ClusterSim: a Java-based parallel discrete-event simulation tool for cluster computing
2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935) Pub Date: 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392639
L.F.W. Goes, Luiz E. Ramos, C. Martins
Abstract: We present the proposal and implementation of a Java-based parallel discrete-event simulation tool for cluster computing called ClusterSim (cluster simulation tool). ClusterSim supports visual modeling and simulation of clusters and their workloads for performance analysis. A cluster is composed of single or multiprocessed nodes, parallel job schedulers, network topologies and technologies. A workload is represented by users that submit jobs composed of tasks described by probability distributions and their internal structure (CPU, I/O and MPI instructions). Our main objectives in this work are: to present the proposal and implementation of the software architecture and simulation model of ClusterSim; to verify and validate ClusterSim; and to analyze ClusterSim by means of a case study. Our main contributions are: the proposal and implementation of ClusterSim with a hybrid workload model, a graphical environment, the modeling of heterogeneous clusters, and a statistical and performance module.
Citations: 20
Parallel I/O: lessons learnt in the last 20 years
2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935) Pub Date: 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392592
Toni Cortes
Abstract: Summary form only given. After these two decades, it is now a good time to review all the work done and try to learn the important lessons all these parallel I/O initiatives have taught us. This work aims at giving this global overview. The focus is not on commercial/academic systems/prototypes, but on the concepts that lie behind them. These concepts have normally been applied at different levels, and thus such an overview can be of interest to many people, ranging from hardware design to application implementation. Some of the most important concepts that are discussed are, among others, data placement (RAIDs, 2D and 3D files, ...), network architectures for parallel I/O (network attached devices, SAN, ...), parallel caching and prefetching (cooperative caching, informed caching and prefetching, ...), and interfaces (collective I/O, data distribution interfaces, ...).
Citations: 0
Flexible and dynamic control of network QoS in grid environments: the QoSINUS approach
2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935) Pub Date: 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392657
P. Primet, J. Montagnat, F. Chanussot
Abstract: Grids rely on a complex interconnection of IP domains that may exhibit changing performance characteristics and may offer different quality of service (QoS) facilities. We examine the case of a biomedical application distributed over a grid and show how it may suffer from uncontrolled communication performance. Then we present the QoSINUS service, which dynamically allocates the network resources to grid flows in order to match their specific QoS requirements under different load conditions. The aim of this approach is to optimize the end-to-end performance that the heterogeneous mix of grid flows gets from the network, enhancing both individual application performance and the overall performance and utilization level of the grid infrastructure. The QoSINUS service is based on the programmable network approach, which offers flexibility and evolvability and enables dynamic adaptation to network load variations. Finally, results of QoSINUS experiments conducted in the context of the eToile French grid testbed, which is based on the high-speed, DiffServ-capable research network infrastructure VTHD, are presented.
Citations: 2
Towards provision of quality of service guarantees in job scheduling
2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935) Pub Date: 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392622
Mohammad Islam, P. Balaji, P. Sadayappan, D. Panda
Abstract: Considerable research has focused on the problem of scheduling dynamically arriving independent parallel jobs on a given set of resources. There has also been some recent work in the direction of providing differentiated service to different classes of jobs using statically or dynamically calculated priorities assigned to the jobs. However, the potential and usability of a quality of service based scheme has not been much studied. In this work, we extend a previously proposed scheme (QoPS) to provide quality of service to submitted jobs; we propose extensions to the algorithm in multiple aspects: (i) studying the effect of user tolerance towards missed deadlines on the overall profit attainable by the supercomputer center, (ii) providing artificial slack to some jobs to maximize the overall profit, and (iii) utilizing a kill-and-restart mechanism to further improve the profit attainable.
Citations: 28
Bandwidth-aware co-allocating meta-schedulers for mini-grid architectures
2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935) Pub Date: 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392600
William M. Jones, Louis W. Pang, W. Ligon, D. Stanzione
Abstract: The interaction of simultaneously co-allocated jobs can often create contention in the network infrastructure of a dedicated computational grid. This contention can lead to degraded job run-time performance. We present several bandwidth-aware co-allocating meta-schedulers. These schedulers take into account inter-cluster network utilization as a means by which to mitigate this impact. We make use of a bandwidth-centric parallel job communication model that captures the time-varying utilization of shared inter-cluster network resources. By doing so, we are able to evaluate the performance of grid scheduling algorithms that focus not only on node resource allocation, but also on shared inter-cluster network bandwidth.
Citations: 22
Improving the performance of communication-intensive parallel applications executing on clusters
2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935) Pub Date: 2004-09-20 DOI: 10.1109/CLUSTR.2004.1392658
X. Qin, Hong Jiang
Abstract: Summary form only given. Clusters have emerged as a primary and cost-effective infrastructure for parallel applications, including communication-intensive applications that transfer a large amount of data among nodes of a cluster via the interconnection network. Conventional load balancers have been proven effective in increasing the utilization of CPU, memory, and disk I/O resources in a cluster. However, most of the existing load balancing schemes ignore network resources, leaving open the opportunity for significant performance bottlenecks to form for communication-intensive parallel applications due to unevenly distributed communication load. To remedy this problem, we propose a communication-aware load balancing technique that is capable of improving the performance of communication-intensive applications by increasing the effective utilization of network resources in clusters. To facilitate the proposed load-balancing scheme, we introduce a behavior model for parallel applications with large requirements of CPU, memory, network, and disk I/O resources. The proposed load-balancing scheme can make full use of this model to quickly and accurately determine the load induced by a variety of parallel applications. Simulation results on executing a diverse set of both synthetic bulk synchronous and real parallel applications on a cluster show that the proposed scheme can significantly improve the performance both in slowdown and turn-around time over three existing schemes by up to 206% (with an average of 74%) and 235% (with an average of 82%), respectively.
Citations: 1
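The idea of folding network load into a load-balancing decision can be sketched as a weighted load index. The abstract does not give the authors' behavior model or formula, so everything below is an assumption for illustration: the `NodeLoad` fields, the weighted-sum `load_index`, and the `pick_node` helper are hypothetical, and the weights are arbitrary.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class NodeLoad:
    """Per-node utilization, each value in [0, 1] (hypothetical model)."""
    cpu: float
    memory: float
    disk_io: float
    network: float


def load_index(load: NodeLoad, weights: Dict[str, float]) -> float:
    """Weighted sum of the four resource utilizations (assumed form)."""
    return (weights["cpu"] * load.cpu
            + weights["memory"] * load.memory
            + weights["disk_io"] * load.disk_io
            + weights["network"] * load.network)


def pick_node(nodes: Dict[str, NodeLoad], weights: Dict[str, float]) -> str:
    """Choose the node with the smallest load index for the next job."""
    return min(nodes, key=lambda name: load_index(nodes[name], weights))
```

Setting the network weight to zero reproduces a balancer that ignores the interconnect; raising it for a communication-intensive job steers that job away from nodes whose links are already busy.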