2005 IEEE International Conference on Cluster Computing: Latest Publications

Exploiting NIC Memory for Improving Cluster-Based Webserver Performance
2005 IEEE International Conference on Cluster Computing. Pub Date: 2005-09-01. DOI: 10.1109/CLUSTR.2005.347067
G. S. Choi, Jin-Ha Kim, D. Ersoz, Mazin S. Yousif, C. Das
Abstract: Improving the performance of Web servers has become critical for handling the increasing demand on network-based services. In this context, we exploit the local memory of programmable network interface cards (NICs) to improve the performance of cluster-based Web servers, which are increasingly used in designing Web server platforms. We use the NIC memory to cache recently accessed data blocks. We have implemented a prototype of the proposed NIC caching mechanism for a distributed Web server, based on PRESS (Carrera et al., 2002), on an 8-node, Myrinet-connected Linux cluster. Measurements with several server workloads show that NIC caching can improve throughput by up to 27% compared to the original PRESS Web server without NIC caching, by minimizing the DMA and PCI bus overhead.
Citations: 2
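As a rough illustration of the caching idea in the abstract above, the following Python sketch models a small fixed-capacity cache of recently accessed data blocks. The abstract does not state the eviction policy or the NIC firmware interface, so least-recently-used (LRU) eviction, the class name `BlockCache`, and the `load_from_host` callback are all assumptions made for illustration only.

```python
from collections import OrderedDict


class BlockCache:
    """Toy model of a small cache for recently accessed data blocks (hypothetical)."""

    def __init__(self, capacity, load_from_host):
        self.capacity = capacity              # number of blocks that fit in the cache
        self.load_from_host = load_from_host  # stand-in for the slow DMA/PCI path
        self.blocks = OrderedDict()           # block_id -> data, least recent first
        self.hits = 0
        self.misses = 0

    def get(self, block_id):
        if block_id in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block_id)  # mark as most recently used
            return self.blocks[block_id]
        self.misses += 1
        data = self.load_from_host(block_id)   # miss: fetch over the slow path
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)    # evict the least recently used block
        return data


# Assumed workload: five block requests against a two-block cache.
cache = BlockCache(capacity=2, load_from_host=lambda block_id: f"data-{block_id}")
for block_id in [1, 2, 1, 3, 2]:
    cache.get(block_id)
print(cache.hits, cache.misses)  # 1 4
```

Every hit in this model is a request that avoids the host-memory path, which is the overhead the abstract says NIC caching minimizes.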
The SMASH Impacts to Cluster Computing
2005 IEEE International Conference on Cluster Computing. Pub Date: 2005-09-01. DOI: 10.1109/CLUSTR.2005.347081
Yung-Chin Fang, J. Hsieh
Abstract: Summary form only given. As high performance computing (HPC) clusters scale out, manageability becomes more important than ever. Over time, a computer center tends to employ multiple management frameworks from vendors to remotely manage generations of heterogeneous HPC clusters to complete one task. The heterogeneous, scaled-out computing infrastructure makes HPCC/grid administration even more challenging and time consuming than before. Management interoperability is usually compromised or absent due to the heterogeneous environment. To solve this problem in the long run and further reduce the total cost of ownership, industry is defining the Systems Management Architecture for Server Hardware (SMASH) initiative. The SMASH initiative is a suite of specifications that standardize management interfaces and remote management architecture for heterogeneous computing environments. The suite includes a unified command line protocol, resource discovery, and resource addressing and data model profiles. SMASH not only addresses complicated administration challenges but also enables hardware-independent remote manageability and job scheduling schemes that are aware of infrastructure status and performance; as a result, it will bring HPC cluster/grid utilization rates to an even higher level. This poster uses figures to illustrate the challenges and the corresponding SMASH specifications, and points out potential research directions in the supercomputing space over SMASH implementations.
Citations: 3
Minimizing the Network Overhead of Checkpointing in Cycle-harvesting Cluster Environments
2005 IEEE International Conference on Cluster Computing. Pub Date: 2005-09-01. DOI: 10.1109/CLUSTR.2005.347074
Daniel Nurmi, J. Brevik, R. Wolski
Abstract: Cycle-harvesting systems such as Condor have been developed to make desktop machines in a local area (which are often similar to clusters in hardware configuration) available as a compute platform. To provide a dual-use capability, opportunistic jobs harvesting cycles from the desktop must be checkpointed before the desktop resources are reclaimed by their owners and the job is evacuated. In this paper, we investigate a new system for computing efficient checkpoint schedules in cycle-harvesting environments. Our system records the historical availability of each resource and fits a statistical model to the observations. Because checkpoints must often traverse the network (i.e., the desktop hosts do not provide sufficient persistent storage for checkpoints), we combine this model with predictions of network performance to the storage site to compute a checkpoint schedule. When an application is initiated on a particular resource, the system uses the computed distribution to parameterize a Markov state-transition model for the application's execution, evaluates the expected time and network overhead as a function of the checkpoint interval, and numerically optimizes with respect to time. We report on the implementation and performance of this system using the Condor cycle-harvesting environment at the University of Wisconsin. We also evaluate the efficiencies we achieve for a variety of network overheads using trace-based simulation. Finally, we validate our simulations against the observed performance with Condor. Our results indicate that while the choice of model distribution has a relatively small but positive effect on time efficiency, it has a substantial impact on network utilization.
Citations: 15
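To make the trade-off between checkpoint interval and network cost concrete, here is a minimal Python sketch. It does not reproduce the paper's fitted availability distributions or its Markov state-transition model; it swaps in Young's well-known first-order approximation, which assumes exponentially distributed failures. The function names and the example numbers (checkpoint size, bandwidth, mean availability) are hypothetical.

```python
import math


def checkpoint_cost_seconds(checkpoint_bytes, bandwidth_bytes_per_s):
    """Time to ship one checkpoint over the network to the storage site."""
    return checkpoint_bytes / bandwidth_bytes_per_s


def young_interval_seconds(cost_s, mean_time_to_failure_s):
    """Young's first-order approximation of the best interval: sqrt(2 * C * MTTF)."""
    return math.sqrt(2.0 * cost_s * mean_time_to_failure_s)


# Assumed example values: a 500 MB checkpoint, a 10 MB/s link to the storage
# site, and a desktop that stays available for one hour on average.
cost = checkpoint_cost_seconds(500e6, 10e6)
interval = young_interval_seconds(cost, 3600.0)
print(cost, interval)  # 50.0 600.0
```

A slower predicted network path raises the checkpoint cost and therefore lengthens the interval, which is the kind of coupling between availability and network predictions that the abstract describes.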
Modeling Protocol Offload for Message-oriented Communication
2005 IEEE International Conference on Cluster Computing. Pub Date: 2005-09-01. DOI: 10.1109/CLUSTR.2005.347069
Patricia Gilfeather, A. Maccabe
Abstract: In this paper, we present a new conceptual model that captures the benefits of protocol offload in the context of high performance computing systems. In contrast to the LAWS model, the extensible message-oriented offload model (EMO) emphasizes communication in terms of messages rather than flows. In contrast to the LogP model, EMO emphasizes the performance of the network protocol rather than the parallel algorithm. EMO allows protocol developers to consider the tradeoffs and specifics associated with offloading protocol processing, including the reduction in message latency along with the benefits of reduced overhead and improved throughput. We give an overview of the EMO model and show how it can be mapped to the LAWS and LogP models. We also show how it can be used to analyze individual messages within TCP flows by contrasting full offload (TCP offload engines) with other approaches, e.g., interrupt coalescing and splintered TCP.
Citations: 17
Efficient and Robust Computation of Resource Clusters in the Internet
2005 IEEE International Conference on Cluster Computing. Pub Date: 2005-09-01. DOI: 10.1109/CLUSTR.2005.347046
Chuang Liu, Ian T Foster
Abstract: Applications such as parallel computing, online games, and content distribution networks need to run on a set of resources with particular network connection characteristics to get good performance. We present an efficient heuristic algorithm to find a set of Internet resources such that the network latency between any pair of those resources is less (or more) than a given value. Our algorithm proceeds in two phases: (1) we use a network flow technique to partition resources into clusters based on end-to-end network latency, such that resources in a cluster have much smaller latency with each other than with other resources; then (2) we search for the required resources in these clusters. We evaluate this method in a large distributed Internet environment, PlanetLab, and show that it can remarkably improve the performance of current search algorithms. We also show that our method is robust despite incomplete and noisy latency measurement data.
Citations: 1
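The search problem stated in the abstract above (find a set of resources whose pairwise latency is below a given value) can be sketched in a few lines of Python. This is a plain greedy search over a latency table, not the paper's two-phase algorithm; in particular, the network-flow partitioning step is omitted. The function name, node names, and latency values are hypothetical.

```python
def find_low_latency_group(latency, size, max_latency):
    """Greedily look for `size` nodes whose pairwise latency is below max_latency.

    `latency[u][v]` is the measured latency between nodes u and v.
    Returns a list of node names, or None if the greedy search finds no group.
    """
    nodes = sorted(latency)
    for seed in nodes:
        group = [seed]
        for candidate in nodes:
            if len(group) == size:
                break
            # Accept a candidate only if it is close to every current member.
            if candidate not in group and all(
                latency[candidate][member] < max_latency for member in group
            ):
                group.append(candidate)
        if len(group) == size:
            return group
    return None


# Assumed measurements in milliseconds; node "c" is far from all others.
pairs = {("a", "b"): 5, ("a", "c"): 80, ("a", "d"): 7,
         ("b", "c"): 90, ("b", "d"): 6, ("c", "d"): 85}
latency = {}
for (u, v), ms in pairs.items():
    latency.setdefault(u, {})[v] = ms
    latency.setdefault(v, {})[u] = ms

print(find_low_latency_group(latency, 3, 10))  # ['a', 'b', 'd']
print(find_low_latency_group(latency, 4, 10))  # None
```

Being greedy, this sketch can miss a valid group that an exhaustive search would find; restricting the search to pre-computed clusters, as the abstract describes, is one way to keep the candidate set small.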
An integrated Retrieval and Pre-fetching algorithms for Segmented Streaming in Mobile Peer-to-Peer Networks
2005 IEEE International Conference on Cluster Computing. Pub Date: 2005-09-01. DOI: 10.1109/CLUSTR.2005.347094
Zhou Su, J. Katto, Y. Yasuda
Abstract: In contrast to conventional P2P systems in wired networks, which consist of static peers, mobile P2P systems are subject to the limitations of battery power, wireless bandwidth, and a dynamically changing network topology. Challenges arise in how to improve source discovery and data replication. In this paper, we present an integrated searching and prefetching algorithm for segmented streaming in mobile peer-to-peer (P2P) networks. First, each stream is divided into several segments, and each segment is assigned a priority based on theoretical analysis. Then, for a given segment, the number of queries sent to search for it and the length of each query are decided dynamically by the segment priority to avoid unnecessary overhead. Next, along the path where a stream is sent from the requester node, some of the nodes on this path are selected to pre-fetch the requested segment to reduce the user delay for the next possible request. Finally, simulation results show that better performance than the conventional methods can be achieved.
Citations: 2
Reliability-aware Checkpoint/Restart Scheme: A Performability Trade-off
2005 IEEE International Conference on Cluster Computing. Pub Date: 2005-09-01. DOI: 10.1109/CLUSTR.2005.347058
Yudan Liu, C. Leangsuksun, Hertong Song, S. Scott
Abstract: In previous years, large scale clusters have been commonly deployed to solve important grand-challenge scientific problems. In order to reduce computational time, system sizes have steadily increased. Unfortunately, the reliability of such cluster systems moves in the opposite direction as system scale grows. Since the failure of a single node could result in a system outage, it is essential to deal effectively with faults in the grand-challenge problem-solving environment. Checkpointing is one of the common fault tolerance techniques. However, checkpointing poses many challenges, such as overhead, latency, and consistency, as well as recovery. In this paper, we introduce a reliability-aware checkpoint/restart method, a novel technique that places checkpoints based on system reliability. We constructed a cost model and derived an optimal checkpoint placement function based on failure rates; a trade-off between performance and reliability (i.e., performability) was a key consideration. We also implemented a proof-of-concept and demonstrated improvements resulting from our techniques for fault-tolerant MPI applications on an HA-OSCAR cluster.
Citations: 18
Implementation and Performance of Portals 3.3 on the Cray XT3
2005 IEEE International Conference on Cluster Computing. Pub Date: 2005-09-01. DOI: 10.1109/CLUSTR.2005.347061
R. Brightwell, Trammell Hudson, K. Pedretti, R. Riesen, K. Underwood
Abstract: The Portals data movement interface was developed at Sandia National Laboratories in collaboration with the University of New Mexico over the last ten years. Portals is intended to provide the functionality necessary to scale a distributed memory parallel computing system to thousands of nodes. Previous versions of Portals ran on several large-scale machines, including a 1024-node nCUBE-2, an 1800-node Intel Paragon, and the 4500-node Intel ASCI Red machine. The latest version of Portals was initially developed for an 1800-node Linux/Myrinet cluster and has since been adopted by Cray as the lowest-level network programming interface for their XT3 platform. In this paper, we describe the implementation of Portals 3.3 on the Cray XT3 and present some initial performance results from several micro-benchmark tests. Despite some limitations, the implementation of Portals is able to achieve a zero-length one-way latency of under six microseconds and a uni-directional bandwidth of more than 1.1 GB/s.
Citations: 35
Meaningful Automated Statistical Analysis of Large Computational Clusters
2005 IEEE International Conference on Cluster Computing. Pub Date: 2005-09-01. DOI: 10.1109/CLUSTR.2005.347090
J. Brandt, A. Gentile, Y. Marzouk, P. Pébay
Abstract: As clusters utilizing commercial off-the-shelf technology have grown from tens to thousands of nodes and typical job sizes have likewise increased, much effort has been devoted to improving the scalability of message-passing fabrics, schedulers, and storage. Largely ignored, however, has been the issue of predicting node failure, which also has a large impact on scalability. In fact, more than ten years into cluster computing, we are still managing this issue on a node-by-node basis even though available diagnostic data has grown immensely. We have built a tool that uses the statistical similarity of the large number of nodes in a cluster to infer the health of each individual node. In the poster, we first present real data and statistical calculations as foundational material and justification for our claims of similarity. Next, we present our methodology and its implications for early notification of deviation from normal behavior, problem diagnosis, automatic code restart via interaction with the scheduler, and airflow distribution monitoring in the machine room. A framework addressing scalability is discussed briefly. Lastly, we present case studies showing how our methodology has been used to detect aberrant nodes whose deviations are still far below the detection level of traditional methods. A summary of the results of the case studies appears below.
Citations: 7
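The idea of using the similarity of many nodes to judge each individual node can be illustrated with a short Python sketch. The abstract does not say which statistic the tool uses, so the robust z-score below (based on the median and the median absolute deviation), the threshold of 3.5, the function name, and the temperature readings are all assumptions chosen for illustration.

```python
import statistics


def flag_aberrant_nodes(readings, threshold=3.5):
    """Return nodes whose reading deviates strongly from the cluster-wide median.

    `readings` maps a node name to one numeric measurement (e.g. a temperature).
    """
    values = list(readings.values())
    median = statistics.median(values)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # all nodes (nearly) identical; nothing to compare against
    flagged = []
    for node, value in readings.items():
        robust_z = 0.6745 * (value - median) / mad
        if abs(robust_z) > threshold:
            flagged.append(node)
    return flagged


# Assumed CPU temperatures in degrees Celsius; "n05" runs hot.
temps = {"n01": 41.0, "n02": 42.0, "n03": 40.5,
         "n04": 41.5, "n05": 55.0, "n06": 41.2}
print(flag_aberrant_nodes(temps))  # ['n05']
```

Because the baseline comes from the other nodes rather than from a fixed alarm limit, a node can be flagged while its reading is still well within a vendor's absolute threshold, which matches the abstract's point about detecting deviations below the level of traditional methods.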
Swapping to Remote Memory over InfiniBand: An Approach using a High Performance Network Block Device
2005 IEEE International Conference on Cluster Computing. Pub Date: 2005-09-01. DOI: 10.1109/CLUSTR.2005.347050
Shuang Liang, R. Noronha, D. Panda
Abstract: Traditionally, operations on memory located on other nodes (remote memory) in cluster environments interconnected with technologies like Gigabit Ethernet have been expensive, with latencies several orders of magnitude higher than those of local memory accesses. Modern RDMA-capable networks such as InfiniBand and Quadrics provide low latency of a few microseconds and high bandwidth of up to 10 Gbps. This has significantly reduced the latency gap between access to local memory and remote memory in modern clusters. Remote idle memory can be exploited to reduce the memory pressure on individual nodes. This is akin to adding a level to the memory hierarchy between local memory and the disk, with potentially dramatic performance improvements, especially for memory-intensive applications. In this paper, we take on the challenge of designing a remote paging system for remote memory utilization in InfiniBand clusters. We present the design and implementation of a high performance networking block device (HPBD) over InfiniBand fabric, which serves as a swap device for the kernel virtual memory (VM) system for efficient page transfer to and from remote memory servers. Our experiments show that with HPBD, quick sort runs only 1.45 times slower than with local memory and up to 21 times faster than with local disk. Our design is also completely transparent to user applications. To the best of our knowledge, this is the first remote pager design that uses InfiniBand for remote memory utilization.
Citations: 116