Components and Analysis of Disaster Tolerant Computing

Chad M. Lawler, Michael A. Harper, Mitchell A. Thornton
2007 IEEE International Performance, Computing, and Communications Conference. Published 2007-04-11. DOI: 10.1109/PCCC.2007.358917 (https://doi.org/10.1109/PCCC.2007.358917)
Citations: 4

Abstract

This paper reviews the components of disaster-tolerant computing and communications and assesses the current state of the field in light of recent man-made terrorist events. It examines the relationships between disaster-tolerant systems, information technology (IT) application availability, and the executive-level management visibility necessary for successful system operation in the event of a catastrophic disaster: one that causes rapid, almost simultaneous, multiple points of failure in a system, as well as single points of failure that escalate into wide catastrophic system failures. The technology, process, and human resource challenges of traditional disaster recovery approaches to disaster preparedness are outlined. The risks of IT application downtime attributable to the increasing dependence on critical IT applications operating in distributed and unbounded networks are explored. A general method for disaster tolerance is proposed that mitigates unplanned downtime through a disciplined approach to IT infrastructure design based on redundancy and distributed components, with special attention given to the ability of executive-level management to comprehend the value of an application's uptime and make appropriate capital investments. The importance of executive visibility into the system-wide impact of downtime, and the resulting effect on the cost of downtime for critical systems, is also explored.
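The link the abstract draws between redundancy, uptime, and the cost of downtime can be illustrated with a small back-of-the-envelope calculation. The sketch below is not the paper's method. It uses the textbook formula for the availability of independent redundant sites and a simple linear cost model. The function names, the 99% per-site availability, and the hourly cost figure are all hypothetical values chosen for illustration.

```python
# Illustrative sketch only: names and figures are assumptions, not from the paper.

HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years


def parallel_availability(site_availability: float, sites: int) -> float:
    """Availability of `sites` redundant sites when any single site can serve.

    Assumes site failures are statistically independent.
    """
    if not 0.0 <= site_availability <= 1.0:
        raise ValueError("site_availability must be in [0, 1]")
    if sites < 1:
        raise ValueError("sites must be >= 1")
    # The service is down only if every site is down at the same time.
    return 1.0 - (1.0 - site_availability) ** sites


def annual_downtime_cost(availability: float, cost_per_hour: float) -> float:
    """Expected yearly downtime cost under a linear cost-per-hour model."""
    return (1.0 - availability) * HOURS_PER_YEAR * cost_per_hour


if __name__ == "__main__":
    # Hypothetical inputs: 99% available sites, 10,000 currency units per hour down.
    for n in (1, 2, 3):
        a = parallel_availability(0.99, n)
        cost = annual_downtime_cost(a, 10_000.0)
        print(f"sites={n} availability={a:.6f} expected_annual_cost={cost:,.0f}")
```

The independence assumption is the weak point of this estimate: the catastrophic scenario described in the abstract, with almost simultaneous multiple points of failure, is exactly the case where failures are correlated, so the figures should be read as an optimistic bound unless the redundant sites are genuinely isolated from one another.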