[1989] The Nineteenth International Symposium on Fault-Tolerant Computing. Digest of Papers: latest papers

Distance agreement protocols
K. Echtle
Abstract: A novel class of agreement protocols suitable for replicated nondeterministic processes is introduced. Reduction of message number and early stopping are achieved by taking distance decisions not after, but during protocol execution. Metrical comparison of results is not restricted to numerical applications. Unlike median selection, it covers multidimensional spaces and helps to solve typical problems of distributed systems, e.g., global scheduling, synchronization, sequence agreement, reconfiguration, and elimination of time skew. A so-called pendulum protocol is described in detail.
DOI: https://doi.org/10.1109/FTCS.1989.105565
Published: 1989-06-21
Source: Semantic Scholar (paper ID 127143960)
Citations: 15
Language constructs for timed atomic commitment
S. Davidson, Insup Lee, V. Wolfe
Abstract: In a large class of hard-real-time control applications, components execute concurrently on distributed nodes and must coordinate, under timing constraints, to perform the control task. As such, they perform a type of atomic commitment. In traditional atomic commitment there are no timing constraints; agreement is eventual. The authors present a definition of timed atomic commitment (TAC) which requires the processes to be functionally consistent, but allows the outcome to include an exceptional state, indicating that faults have caused timing constraints to be violated. The authors also present a high-level language construct that facilitates the use of TAC in distributed real-time programming and discuss its behavior when faults occur.
DOI: https://doi.org/10.1109/FTCS.1989.105621
Published: 1989-06-21
Source: Semantic Scholar (paper ID 122810314)
Citations: 3
Control-flow checking using watchdog assists and extended-precision checksums
N. Saxena, E. McCluskey
Abstract: A control-flow checking method is proposed. Extended-precision checksum-based control-flow checking is shown to have low error detection latency compared to previously proposed methods. Analytical measures are derived to demonstrate the effectiveness of using extended-precision checksums for control-flow checking. The error detection latency in the extended-precision checksum-based control-flow checking remains relatively constant for both single and multiple sequence errors. In the case of signature-based methods, error detection latency increases linearly with the number of sequence errors. A watchdog assist architecture for control-flow checking in programs is defined. Unlike previously proposed control-flow checking methods, this watchdog assist architecture is well suited for multiprocessor, multiprogramming, and cache-based environments. The Hewlett-Packard Precision Architecture is used as an example to demonstrate the feasibility of watchdog assists.
DOI: https://doi.org/10.1109/FTCS.1989.105615
Published: 1989-06-21
Source: Semantic Scholar (paper ID 114543289)
Citations: 104
Fault diagnosis for sparsely interconnected multiprocessor systems
D. Blough, G. Sullivan, G. Masson
Abstract: The authors present a general approach to fault diagnosis that is widely applicable and requires only a limited number of connections among units. Each unit in the system forms a private opinion on the status of each of its neighboring units based on duplication of jobs and comparison of job results over time. A diagnosis algorithm that consists of simply taking a majority vote among the neighbors of a unit to determine the status of that unit is then executed. The performance of this simple majority-vote diagnosis algorithm is analyzed using a probabilistic model for the faults in the system. It is shown that with high probability, for systems composed of n units, the algorithm will correctly identify the status of all units when each unit is connected to O(log n) other units. It is also shown that the algorithm works with high probability in a class of systems in which the average number of neighbors of a unit is constant. The results indicate that fault diagnosis can in fact be achieved quite simply in multiprocessor systems containing a low to moderate number of testing conditions.
DOI: https://doi.org/10.1109/FTCS.1989.105544
Published: 1989-06-21
Source: Semantic Scholar (paper ID 128515953)
Citations: 48
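Note: the majority-vote step described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `majority_vote_diagnosis`, the `neighbors` and `opinions` data layout, and the rule that ties count as fault-free are all assumptions made for illustration, and the way each unit forms its opinions (job duplication and result comparison over time) is not modeled.

```python
from typing import Dict, Hashable, Iterable, Tuple

Unit = Hashable


def majority_vote_diagnosis(
    neighbors: Dict[Unit, Iterable[Unit]],
    opinions: Dict[Tuple[Unit, Unit], bool],
) -> Dict[Unit, bool]:
    """Return {unit: True if the unit is diagnosed as faulty}.

    neighbors[v] lists the units that hold an opinion about v.
    opinions[(u, v)] is True when u believes its neighbor v is faulty.
    (Hypothetical data layout, chosen only for this sketch.)
    """
    diagnosis = {}
    for unit, nbrs in neighbors.items():
        votes = [opinions[(n, unit)] for n in nbrs]
        faulty_votes = sum(votes)
        # A strict majority is required; a tie is treated as fault-free.
        # The tie rule is an assumption, not taken from the abstract.
        diagnosis[unit] = 2 * faulty_votes > len(votes)
    return diagnosis


if __name__ == "__main__":
    units = ["a", "b", "c", "d"]
    # Fully connected 4-unit system in which unit "c" is faulty.
    neighbors = {u: [v for v in units if v != u] for u in units}
    opinions = {}
    for u in units:
        for v in neighbors[u]:
            if u == "c":
                opinions[(u, v)] = True        # faulty unit accuses everyone
            else:
                opinions[(u, v)] = (v == "c")  # fault-free units judge correctly
    print(majority_vote_diagnosis(neighbors, opinions))
```

In this small example the single faulty unit is outvoted everywhere, so only "c" is flagged. The abstract's probabilistic guarantee concerns much sparser systems, which this sketch does not analyze.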
Performability of a token bus network under transient fault conditions
J. F. Meyer, K. Muralidhar, W. Sanders
Abstract: The authors present the results of a detailed performability evaluation of a network using the IEEE 802.4 protocol. In particular a 30 station IEEE 802.4 token bus network operating in a hostile factory environment is evaluated using stochastic activity networks. Stochastic activity networks, a generalization of stochastic Petri nets, provide a convenient representation for computer networks and are formal enough to permit solution by both analysis and simulation. The evaluation results show (1) that stochastic activity networks are an appropriate model type for evaluating the performability of local-area networks, and (2) that the protocol is extremely tolerant to transient faults such as token losses and noise bursts under moderate network loads.
DOI: https://doi.org/10.1109/FTCS.1989.105562
Published: 1989-06-21
Source: Semantic Scholar (paper ID 123832993)
Citations: 12
An analytical model for computing hypercube availability
C. Das, Jong Kim
Abstract: An analytical model is presented for computing the availability of an n-dimensional hypercube. The model computes the probability of j connected working nodes in a hypercube by multiplying two probabilistic terms. The first term is the probability of x connected nodes (x ≥ j) working out of 2^n fully connected nodes. This is obtained from the numerical solution of the well-known machine repairman model, modified to capture imperfect coverage and imprecise repair. The second term, which is the probability of having j connected nodes in a hypercube, is computed from an approximate model of the hypercube. The approximate model, in turn, is based on a decomposition principle, where an n-cube connectivity is computed from a two-cube base model using a recursive equation. The availability model studied in this paper is known as task-based availability, where a system remains operational as long as a task can be executed on the system. Analytical results from n-dimensional cubes are given for various task requirements. The model is validated by comparing the analytical results with those from simulation.
DOI: https://doi.org/10.1109/FTCS.1989.105631
Published: 1989-06-21
Source: Semantic Scholar (paper ID 127060604)
Citations: 4
Message routing in HARTS with faulty components
A. Olson, K. Shin
Abstract: The authors develop a routing scheme in two steps for a wrapped hexagonal mesh, called HARTS (hexagonal architecture for real-time systems), which ensures the delivery of every message as long as there is a path between its source and destination. The scheme can also detect the nonexistence of a path between a pair of nodes in a finite amount of time. Moreover, the scheme requires each node in HARTS to know only the state (faulty or not) of each of its own links. The performance of the simple routing scheme is simulated for three- and five-dimensional H-meshes while the physical distribution of faulty components is varied. It is shown that a shortest path between the source and the destination of each message is taken with a high probability, and a path, if one exists, is usually found very quickly.
DOI: https://doi.org/10.1109/FTCS.1989.105588
Published: 1989-06-21
Source: Semantic Scholar (paper ID 124438187)
Citations: 10
Fail-softness evaluation in multiple-bus local computer networks
Vikram V. Karmarkar, J. G. Kuhl
Abstract: A fail-softness evaluation methodology is presented which is suitable for quantifying the graceful degradation characteristics of local computer networks (LCN) using multiple buses. The approach quantifies degradation of performance due to failure over any given application lifetime and also yields a single figure of merit that can be used for comparison of alternative multiple-bus LCN architectures with specific reliability/cost constraints. The analysis technique models both network service failures and configuration-related delay characteristics. Existing notions of performability analysis and bandwidth availability are used in the modeling process to derive a combined performance/reliability measure. The fail-softness analysis is used to compare several alternative multiple-bus architectures, which use different demand-assignment multiple-access (DAMA) methods. A class of integrated access methodologies that use a single shared token to arbitrate access to all buses is shown to exhibit generally superior performance/reliability characteristics as compared to other alternatives, such as those which use an independent DAMA protocol for each bus.
DOI: https://doi.org/10.1109/FTCS.1989.105632
Published: 1989-06-21
Source: Semantic Scholar (paper ID 115031054)
Citations: 1
Understanding large system failures - a fault injection experiment
R. Chillarege, N. Bowen
Abstract: Fault injection is used to characterize large system failures. Thus, it overcomes limitations imposed by the lack of complete information in field failure data. The experiment is conducted on a commercial transaction processing system. The authors: (1) introduce the idea of failure acceleration to conduct such experiments; (2) estimate total loss of the primary service to occur in only 16% of the faults; (3) reveal errors termed potential hazards that do not affect short-term availability but cause a catastrophic failure following a change in operating state; and (4) identify at least 41% of errors as potential candidates for repair before total failure. The results enhance the understanding of large system failures and provide a foundation for design enhancements and modeling of availability.
DOI: https://doi.org/10.1109/FTCS.1989.105592
Published: 1989-06-21
Source: Semantic Scholar (paper ID 129462518)
Citations: 191
A system for supporting multi-language versions for software fault tolerance
James M. Purtilo, P. Jalote
Abstract: A description is given of a system that allows versions to be coded in different programming languages. The system supports both the recovery block scheme and the N-version programming method. It permits fault tolerance to be used for specified modules that could be embedded in a larger program. The system also allows the different versions to be executed on different machines. It has been implemented in C on DEC Vaxes and Sun 3 workstations and operates in a network of Unix-based machines.
DOI: https://doi.org/10.1109/FTCS.1989.105578
Published: 1989-06-21
Source: Semantic Scholar (paper ID 129230879)
Citations: 14