10th IEEE Pacific Rim International Symposium on Dependable Computing, 2004. Proceedings. Latest articles

A dynamic checkpointing scheme based on reinforcement learning
H. Okamura, Y. Nishimura, T. Dohi
{"title":"A dynamic checkpointing scheme based on reinforcement learning","authors":"H. Okamura, Y. Nishimura, T. Dohi","doi":"10.1109/PRDC.2004.1276566","DOIUrl":"https://doi.org/10.1109/PRDC.2004.1276566","url":null,"abstract":"We develop a new checkpointing scheme for a uniprocess application. First, we model the checkpointing scheme by a semiMarkov decision process, and apply the reinforcement learning algorithm to estimate statistically the optimal checkpointing policy. More specifically, the representative reinforcement learning algorithm, called the Q-learning algorithm, is used to develop an adaptive checkpointing scheme. In simulation experiments, we examine the asymptotic behavior of the system overhead with adaptive checkpointing and show quantitatively that the proposed dynamic checkpoint algorithm is useful and robust under an incomplete knowledge on the failure time distribution.","PeriodicalId":383639,"journal":{"name":"10th IEEE Pacific Rim International Symposium on Dependable Computing, 2004. Proceedings.","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126452983","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 21
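The abstract of the entry above describes estimating a checkpointing policy with Q-learning. The following is only a minimal sketch of that idea, not the authors' model: it assumes a discrete-time simplification (the paper uses a semi-Markov decision process), a constant per-step failure probability, and hypothetical parameter names such as `fail_prob` and `checkpoint_cost`.

```python
import random

def train_checkpoint_policy(episodes=2000, max_age=20, fail_prob=0.05,
                            checkpoint_cost=1.0, alpha=0.1, gamma=0.95,
                            epsilon=0.1, horizon=200, seed=0):
    """Tabular Q-learning over 'age' = time steps since the last checkpoint.

    Actions: 0 = keep computing, 1 = take a checkpoint.
    Costs are modelled as negative rewards (all values here are assumptions):
    a checkpoint costs `checkpoint_cost`; a failure loses the unsaved work.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(max_age + 1)]
    for _ in range(episodes):
        age = 0
        for _ in range(horizon):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[age][0] >= q[age][1] else 1
            if action == 1:
                # Checkpoint: pay the overhead, unsaved work drops to zero.
                reward, next_age = -checkpoint_cost, 0
            elif rng.random() < fail_prob:
                # Failure: the work since the last checkpoint is lost.
                reward, next_age = -float(age + 1), 0
            else:
                # One more step of useful, not yet saved work.
                reward, next_age = 0.0, min(age + 1, max_age)
            # Standard Q-learning update.
            target = reward + gamma * max(q[next_age])
            q[age][action] += alpha * (target - q[age][action])
            age = next_age
    return q

def greedy_policy(q):
    """Return the greedy action (0 or 1) for every age."""
    return [0 if row[0] >= row[1] else 1 for row in q]
```

Calling `greedy_policy(train_checkpoint_policy())` yields one action per age; under these assumptions the learner needs no explicit knowledge of the failure distribution, which is the property the abstract emphasizes.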
Safety testing of safety critical software based on critical mission duration
Shiping Yang, Nan Sang, Guang-ze Xiong
{"title":"Safety testing of safety critical software based on critical mission duration","authors":"Shiping Yang, Nan Sang, Guang-ze Xiong","doi":"10.1109/PRDC.2004.1276557","DOIUrl":"https://doi.org/10.1109/PRDC.2004.1276557","url":null,"abstract":"To assess the safety of software based safety critical systems, we firstly analyzed the differences between reliability and safety, then, introduced a safety model based on three-state Markov model and some safety-related metrics. For safety critical software it is common to demand that all known faults are removed. Thus an operational test for safety critical software takes the form of a specified number of test cases (or a specified critical mission duration) that must be executed unsafe-failure-free. When the previous test has been early terminated as a result of an unsafe failure, it has been proposed that the further test need to be more stringent (i.e. the number of tests that must be executed unsafe-failure-free should increase). In order to solve the problem, a safety testing method based on critical mission duration and Bayesian testing stopping rules is proposed.","PeriodicalId":383639,"journal":{"name":"10th IEEE Pacific Rim International Symposium on Dependable Computing, 2004. Proceedings.","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127373657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13
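The abstract of the entry above refers to a number of test cases that must run unsafe-failure-free under a Bayesian stopping rule. As a hedged illustration only, the sketch below uses the textbook case of a uniform Beta(1, 1) prior on the per-test unsafe-failure probability; the paper's actual prior and its rule for making tests more stringent after a failure are not given in the abstract, and the function name is hypothetical.

```python
import math

def failure_free_tests_needed(p0, confidence):
    """Smallest n such that n unsafe-failure-free tests give
    P(p <= p0) >= confidence under a uniform Beta(1, 1) prior (an assumption).

    After n failure-free tests the posterior is Beta(1, n + 1), whose CDF
    at p0 is 1 - (1 - p0) ** (n + 1).
    """
    if not (0.0 < p0 < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("p0 and confidence must lie strictly between 0 and 1")
    n = math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p0) - 1.0)
    return max(n, 0)

def posterior_confidence(p0, n):
    """P(p <= p0) after n failure-free tests, same uniform-prior assumption."""
    return 1.0 - (1.0 - p0) ** (n + 1)
```

For example, `failure_free_tests_needed(0.01, 0.99)` gives the test count needed to be 99% confident that the unsafe-failure probability per test is at most 0.01, given this prior.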
On the effects of partial membership knowledge on reliability of gossip-based multicast
Tatsuhiro Tsuchiya, T. Kikuno
{"title":"On the effects of partial membership knowledge on reliability of gossip-based multicast","authors":"Tatsuhiro Tsuchiya, T. Kikuno","doi":"10.1109/PRDC.2004.1276555","DOIUrl":"https://doi.org/10.1109/PRDC.2004.1276555","url":null,"abstract":"Gossip-based multicast schemes have attracted increasing interest, because they are easy to deploy and resilient to failures. However, traditional gossip-based protocols rely on each process having knowledge of the global membership, thus limiting their scalability. To overcome this problem several protocols have been developed that can operate with processes having only a partial view of the global membership. We discuss the effects of partial views on the reliability of gossip-based multicast protocols. Specifically, we identify three desirable properties for views and show constructions of views satisfying these properties. Numerical results obtained show that reliability can be considerably affected by views adopted, especially in the presence of faulty processes.","PeriodicalId":383639,"journal":{"name":"10th IEEE Pacific Rim International Symposium on Dependable Computing, 2004. Proceedings.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130648014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
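The abstract of the entry above concerns how partial views affect the reliability of gossip-based multicast. The sketch below is a toy simulation under stated assumptions, not the protocol or view constructions from the paper: views are uniformly random, each process forwards a message once on first receipt, and crashed processes silently drop messages. All parameter names are hypothetical.

```python
import random

def simulate_gossip(n=100, view_size=10, fanout=3, rounds=10,
                    crashed=0.0, seed=0):
    """Return the fraction of correct processes that receive one multicast.

    Each process knows only `view_size` random peers (its partial view)
    and forwards the message to `fanout` of them on first receipt.
    """
    rng = random.Random(seed)
    # Partial view: a uniform random sample of the other processes.
    views = [rng.sample([j for j in range(n) if j != i], view_size)
             for i in range(n)]
    # Each process crashes independently with probability `crashed`.
    alive = [rng.random() >= crashed for _ in range(n)]
    alive[0] = True  # the source (process 0) is assumed correct
    delivered = {0}
    frontier = [0]
    for _ in range(rounds):
        next_frontier = []
        for p in frontier:
            for target in rng.sample(views[p], min(fanout, view_size)):
                if alive[target] and target not in delivered:
                    delivered.add(target)
                    next_frontier.append(target)
        frontier = next_frontier
    return len(delivered) / sum(alive)
```

Running `simulate_gossip` with different `view_size` and `crashed` values gives a rough feel for the effect the abstract reports, namely that the choice of views matters more when faulty processes are present.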
WINDAR: a multithreaded rollback-recovery toolkit on windows
Jinmin Yang, Dafang Zhang, Zheng Qin, X. Yang
{"title":"WINDAR: a multithreaded rollback-recovery toolkit on windows","authors":"Jinmin Yang, Dafang Zhang, Zheng Qin, X. Yang","doi":"10.1109/PRDC.2004.1276596","DOIUrl":"https://doi.org/10.1109/PRDC.2004.1276596","url":null,"abstract":"We describe the design and implementation of WINDAR, an object-oriented toolkit for transparent rollback-recovery of distributed applications running on Windows platform. In WINDAR, the workloads of a process are multithreaded, exploiting effectively processor execution resources to improve execution efficiency. In addition, WINDAR's unified framework for various rollback recovery protocols enables dynamic protocol configuration to adapt itself to the need of recovery-oriented computing (ROC) and distributed computations in Internet environment. WINDAR was evaluated using three benchmarks. It is observed that multithreading is an effective approach to improve the performance of message logging protocols, especially for pessimistic message logging. In our experiment, the overhead ratio of pessimistic message logging was reduced to the same magnitude as that of the optimistic message logging for three benchmarks.","PeriodicalId":383639,"journal":{"name":"10th IEEE Pacific Rim International Symposium on Dependable Computing, 2004. Proceedings.","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125623126","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Benchmarking operating system dependability: Windows 2000 as a case study
A. Kalakech, T. Jarboui, J. Arlat, Y. Crouzet, K. Kanoun
{"title":"Benchmarking operating system dependability: Windows 2000 as a case study","authors":"A. Kalakech, T. Jarboui, J. Arlat, Y. Crouzet, K. Kanoun","doi":"10.1109/PRDC.2004.1276576","DOIUrl":"https://doi.org/10.1109/PRDC.2004.1276576","url":null,"abstract":"We propose a dependability benchmark suitable for a general purpose operating system (OS). The specifications of the benchmark components are presented and illustrated on a benchmark prototype dedicated to Windows 2000. The important novelty, as regards OS dependability benchmarking, is threefold. First, it lies on a comprehensive and structured set of measures: outcomes are considered both at the OS level and at the application level. Second, these measures include not only robustness measures (e.g., the distribution among the observed outcomes for the OS and the application: error codes, exceptions, workload correct or erroneous completion, OS and application hang), but also the related temporal measures in the presence of faults (e.g., system call and workload execution times, as well as operating system restart time). Finally, we are considering a realistic workload (namely, TPC-C client), instead of a synthetic workload.","PeriodicalId":383639,"journal":{"name":"10th IEEE Pacific Rim International Symposium on Dependable Computing, 2004. Proceedings.","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131560212","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 24
Representing user workarounds as a component of system dependability
Christopher Martin, P. Koopman
{"title":"Representing user workarounds as a component of system dependability","authors":"Christopher Martin, P. Koopman","doi":"10.1109/PRDC.2004.1276591","DOIUrl":"https://doi.org/10.1109/PRDC.2004.1276591","url":null,"abstract":"Evaluation of system-level dependability can benefit from representing and assessing the effects of user workarounds as a response to system component failures. We assemble sequence diagrams that represent UML scenarios into mission graphs that contain all possible paths from a particular mission starting point to a particular mission success goal point. Analysis of these graphs reveals potential dependability bottlenecks and the existence of possible workarounds that can be intentionally added to a design, retrofitted to fit an existing design, or discovered as an emergent property of existing system and user behaviors. Simulations of a moderately complex distributed embedded system demonstrate that this approach has potential benefits for representing and improving system-level dependability by including the ability of users to perform simple workarounds to achieve mission objectives.","PeriodicalId":383639,"journal":{"name":"10th IEEE Pacific Rim International Symposium on Dependable Computing, 2004. Proceedings.","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122186969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
The system recovery benchmark
James Mauro, Ji Zhu, I. Pramanick
{"title":"The system recovery benchmark","authors":"James Mauro, Ji Zhu, I. Pramanick","doi":"10.1109/PRDC.2004.1276577","DOIUrl":"https://doi.org/10.1109/PRDC.2004.1276577","url":null,"abstract":"We describe a benchmark for measuring system recovery on a nonclustered standalone system. A system's ability to recover from an outage quickly is a critical factor in overall system availability. General purpose computer systems, such as UNIX based systems, tend to execute the same sequence or series of steps during system startup and outage recovery. Our experience has shown that these steps are consistent, reproducible and measurable, and can thus be benchmarked. Additionally, the factors that create variability in restart/recovery can be bound and represented in a meaningful way. A defined set of measurements, coupled with a specification for representing the results and system variables, provide the foundation for system recovery benchmarking.","PeriodicalId":383639,"journal":{"name":"10th IEEE Pacific Rim International Symposium on Dependable Computing, 2004. Proceedings.","volume":"98 22","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113944100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 26
RedCAN™: simulations of two fault recovery algorithms for CAN
H. Sivencrona, T. Olssøn, R. Johansson, J. Torin
{"title":"RedCAN/sup TM/: simulations of two fault recovery algorithms for CAN","authors":"H. Sivencrona, T. Olssøn, R. Johansson, J. Torin","doi":"10.1109/PRDC.2004.1276580","DOIUrl":"https://doi.org/10.1109/PRDC.2004.1276580","url":null,"abstract":"We present the RedCAN concept to achieve fault tolerance against node and link failures in a CAN-bus system by means of configurable switches. The basic idea in RedCAN is to isolate faulty nodes or bus segments by configuring switches that will evade a faulty node or segment and exclude it from bus access. We propose changes to the original centralized protocol, vulnerable to single point failures, and show that with a new distributed algorithm considerable more efficiency can be achieved also when network size is growing. The distributed algorithm introduces redundancy and hereby increases robustness of the system. Furthermore, the new algorithm has logarithmic complexity, as opposed to the centralized algorithms linear complexity, as the number of nodes increase. The results were gathered through a new simulator, the \"RedCAN Simulation Manager\", also presented. Simulations allow assessing the break-even point between centralized and distributed algorithms reconfiguration latencies as well as give ideas for further research.","PeriodicalId":383639,"journal":{"name":"10th IEEE Pacific Rim International Symposium on Dependable Computing, 2004. Proceedings.","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125630000","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Optimal allocation of testing-resource considering cost, reliability, and testing-effort
Chin-Yu Huang, J. Lo, S. Kuo, Michael R. Lyu
{"title":"Optimal allocation of testing-resource considering cost, reliability, and testing-effort","authors":"Chin-Yu Huang, J. Lo, S. Kuo, Michael R. Lyu","doi":"10.1109/PRDC.2004.1276561","DOIUrl":"https://doi.org/10.1109/PRDC.2004.1276561","url":null,"abstract":"We investigate an optimal resource allocation problem in modular software systems during testing phase. The main purpose is to minimize the cost of software development when the number of remaining faults and a desired reliability objective are given. An elaborated optimization algorithm based on the Lagrange multiplier method is proposed and numerical examples are illustrated. Besides, sensitivity analysis is also conducted. We analyze the sensitivity of parameters of proposed software reliability growth models and show the results in detail. In addition, we present the impact on the resource allocation problem if some parameters are either overestimated or underestimated. We can evaluate the optimal resource allocation problems for various conditions by examining the behavior of the parameters with the most significant influence. The experimental results greatly help us to identify the contributions of each selected parameter and its weight. The proposed algorithm and method can facilitate the allocation of limited testing-resource efficiently and thus the desired reliability objective during software module testing can be better achieved.","PeriodicalId":383639,"journal":{"name":"10th IEEE Pacific Rim International Symposium on Dependable Computing, 2004. Proceedings.","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127892729","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 50
Stochastic Petri nets and inheritance for dependability modelling
Simona Bernardi, S. Donatelli
{"title":"Stochastic Petri nets and inheritance for dependability modelling","authors":"Simona Bernardi, S. Donatelli","doi":"10.1109/PRDC.2004.1276592","DOIUrl":"https://doi.org/10.1109/PRDC.2004.1276592","url":null,"abstract":"Reuse is a well-known and widely accepted principle in design and programming, that is instantiated through two main means: modularity and inheritance. Modularity allows a function or a data type and associated functions to be reused, while inheritance is based on the idea that a set of common features of a type can be factorized into a common supertype. While modularity has been widely exploited in performance and dependability modelling, inheritance is instead pretty much a \"still-to-investigate\" topic for this field. We discuss the role of inheritance in stochastic Petri nets (SPN) modelling, by considering a representation of the fault, error, and failure (FEF) chain based on hierarchies of classes (in the class diagram formalism of UML) and corresponding hierarchies of SPN models.","PeriodicalId":383639,"journal":{"name":"10th IEEE Pacific Rim International Symposium on Dependable Computing, 2004. Proceedings.","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126541884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11