Estimating System Availability And Reliability

A. Goyal
Published in: 1989 Winter Simulation Conference Proceedings, 1989-12-04. DOI: 10.1109/WSC.1989.718688 (https://doi.org/10.1109/WSC.1989.718688)

Abstract

This paper deals with methods for constructing and solving large Markov chain models of computer system availability and reliability. A set of powerful high-level modeling constructs is discussed that can be used to represent the failure and repair behavior of components and their interactions. If time-independent failure and repair rates are assumed, then a time-homogeneous continuous-time Markov chain can be constructed automatically from the modeling constructs used to describe the system. Since the size of the Markov chain grows exponentially with the number of components modeled, simulation appears to be a practical way of solving models of large systems. However, standard simulation requires very long runs to estimate availability and reliability measures, because system failure is a rare event. Therefore, variance reduction techniques that can aid in computing rare-event probabilities quickly are of interest. Specifically, the Importance Sampling technique has been found to be most useful. The modeling language and the simulation methods discussed in this paper have been implemented in a program package called the System Availability Estimator (SAVE).
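
To illustrate why importance sampling helps with rare system failures, the following is a minimal sketch and not the SAVE implementation. It assumes a hypothetical model that the abstract does not describe: n identical components, each failing at rate lam, served by a single repair facility with rate mu, with the system failed when all n components are down. It estimates the probability that, after a first component failure, the system fails before every component is repaired. The importance sampling variant used here is simple failure biasing (each transition is forced to be a failure with a fixed probability and the likelihood ratio corrects for the change); the function names, the parameters, and the bias value are all assumptions made for illustration.

```python
import random


def failure_prob(k, n, lam, mu):
    """Probability that the next event is a failure when k of n components are down.

    Assumed model: each working component fails at rate lam and a single
    repair facility completes repairs at rate mu.
    """
    fail_rate = (n - k) * lam
    return fail_rate / (fail_rate + mu)


def exact_gamma(n, lam, mu):
    """Exact probability of reaching n failed components before 0, starting from 1.

    Uses the standard hitting-probability formula for a birth-death chain.
    """
    total, ratio = 1.0, 1.0
    for k in range(1, n):
        p = failure_prob(k, n, lam, mu)
        ratio *= (1.0 - p) / p  # product of repair/failure odds up to state k
        total += ratio
    return 1.0 / total


def importance_sampling_gamma(n, lam, mu, bias, cycles, rng):
    """Estimate the same probability by simulating with failure biasing.

    Under the biased chain, every transition is a failure with probability
    `bias`; the running `weight` is the likelihood ratio of the path under the
    original chain relative to the biased chain, which keeps the estimate unbiased.
    """
    total = 0.0
    for _ in range(cycles):
        k, weight = 1, 1.0
        while 0 < k < n:
            p = failure_prob(k, n, lam, mu)
            if rng.random() < bias:
                weight *= p / bias  # failure chosen under the biased law
                k += 1
            else:
                weight *= (1.0 - p) / (1.0 - bias)  # repair chosen
                k -= 1
        if k == n:  # system failure reached in this cycle
            total += weight
    return total / cycles


if __name__ == "__main__":
    rng = random.Random(0)
    n, lam, mu = 4, 0.001, 1.0
    print("exact:   ", exact_gamma(n, lam, mu))
    print("estimate:", importance_sampling_gamma(n, lam, mu, 0.5, 50_000, rng))
```

With these illustrative parameters the target probability is roughly 6e-9, so unbiased (standard) simulation would need on the order of hundreds of millions of cycles to observe even one system failure, whereas the biased chain reaches the failed state in a sizable fraction of cycles.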