Disaster tolerant Wolfpack geo-clusters

Richard S. Wilkins, Xing Du, Robert A. Cochran, M. Popp
Proceedings. IEEE International Conference on Cluster Computing, 2002-09-23
DOI: 10.1109/CLUSTR.2002.1137750 (https://doi.org/10.1109/CLUSTR.2002.1137750)
Citations: 2

Abstract

Clustering computer systems to increase application availability has become common industry practice. Although clustering improves the availability of applications and their data, it does not protect against a disaster (flood, tornado, earthquake, terrorism, civil unrest, etc.) that makes the entire cluster, along with the applications and data it serves, unavailable. Distance mirroring of an application's data store allows recovery from a disaster but may still result in unacceptably long periods of downtime. This paper describes a method for geographically stretching a standard Wolfpack (Microsoft™ Cluster Service, MSCS) cluster of Intel-architecture servers for disaster tolerance. Server nodes and their storage may be placed at two (or more) distant sites so that a single disaster cannot take down the entire cluster. Standard cluster semantics and ease of use are maintained by using the remote mirroring capabilities of Hewlett-Packard's high-end storage arrays. The paper describes the design of additional software that controls data mirroring behavior when applications are moved or failed over between server nodes. It also describes software that "stretches" the cluster quorum disk between sites in a manner transparent to the cluster software, and software for an external arbitrator node that provides rapid recovery from a total loss of inter-site communications. The array firmware's mirroring options (synchronous or asynchronous I/O mirroring) allow inter-site link bandwidth to be used optimally according to the data safety requirements of individual applications.
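The abstract does not spell out how the external arbitrator node reaches its decision, so the following is only a generic sketch of a tie-breaker for a two-site cluster that has lost all inter-site communication. The names `SiteReport` and `pick_survivor`, and the rule of preferring the site that previously owned the quorum disk, are assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SiteReport:
    """What a hypothetical arbitrator knows about one site (illustrative only)."""
    name: str
    reachable: bool      # the arbitrator can still contact this site
    owns_quorum: bool    # the site held the quorum disk before the partition


def pick_survivor(a: SiteReport, b: SiteReport) -> Optional[str]:
    """Return the name of the site that should keep running, or None."""
    reachable = [s for s in (a, b) if s.reachable]
    if not reachable:
        # The arbitrator sees neither site, so it cannot grant ownership.
        return None
    if len(reachable) == 1:
        # Only one site is alive from the arbitrator's point of view.
        return reachable[0].name
    # Both sites are alive but cannot see each other: assumed rule is to
    # prefer the previous quorum owner so exactly one side continues.
    owners = [s for s in reachable if s.owns_quorum]
    return (owners[0] if owners else reachable[0]).name
```

The point of the sketch is that a third vantage point lets exactly one site continue after a partition, which is what avoids both sites writing to their mirror copies independently.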