Proceedings. 14th Symposium on Reliable Distributed Systems: Latest Papers (Page 2)

Self diagnosis of processor arrays using a comparison model

Proceedings. 14th Symposium on Reliable Distributed Systems Pub Date : 1995-09-13 DOI: 10.1109/RELDIS.1995.526229

P. Maestrini, P. Santi

Citations: 25

Membership and system diagnosis

Proceedings. 14th Symposium on Reliable Distributed Systems Pub Date : 1995-09-13 DOI: 10.1109/RELDIS.1995.526228

M. Hiltunen

Citations: 26

System support for robust collaborative applications

Proceedings. 14th Symposium on Reliable Distributed Systems Pub Date : 1995-09-13 DOI: 10.1109/RELDIS.1995.526214

M. Chelliah, M. Ahamad

Abstract: Traditional transaction models ensure robustness for distributed applications through the properties of view and failure atomicity. It has generally been felt that such atomicity properties are restrictive for a wide range of application domains; this is particularly true for robust, collaborative applications because such applications have concurrent components that are inherently long-lived and that cooperate. Recent advances in extended transaction models can be exploited to structure long-lived and cooperative computations. Applications can use a combination of such models to achieve the desired degree of robustness; hence, we develop a system which can support a number of flexible transaction models, with correctness criteria that extend or relax serializability. We analyze two concrete CSCW applications: a collaborative editor and a meeting scheduler. We show how a combination of two extended transaction models that promote split and cooperating actions facilitates robust implementations of these collaborative applications. Thus, we conclude that a system that implements multiple transaction models provides flexible support for building robust collaborative applications.

Citations: 0

TMR processing without explicit clock synchronisation

Proceedings. 14th Symposium on Reliable Distributed Systems Pub Date : 1995-09-13 DOI: 10.1109/RELDIS.1995.526226

F. Brasileiro, P. Ezhilchelvan, N. Speirs

Citations: 2

A method for the construction and interpretation of high level models for distributed fault-tolerant systems

Proceedings. 14th Symposium on Reliable Distributed Systems Pub Date : 1995-09-13 DOI: 10.1109/RELDIS.1995.526215

K. Tilly, István Kiss, G. Román, T. Dobrowiecki, A. Várkonyi-Kóczy

Abstract: Traditional solutions for achieving fault-tolerance are intended for use at design time and they generally capture system information at a very low (hardware or machine instruction) level. Increasing the reliability of complex information systems containing many (perhaps many thousands of) autonomous components requires different solutions. This article presents a new methodology for the implementation of large scale, distributed fault-tolerant systems. System models are formed of objects describing requirements, services and resources organized into high level top-down hierarchical decomposition structures. Since redundancy is a natural property of any large scale system, by using such models it is possible to achieve fault-tolerant behaviour by finding multiple appropriate mappings between requirements and available services, and to support the required services by available resources. The distributed system is extended with dedicated components, called diagnostic centres, which manage distinct parts of the system model, continuously observe the operation of the distributed system, and find alternative requirement-service mappings if some services fail to fulfil their associated requirements. The elements and the structure of the proposed system modelling method are presented, an appropriate fault model is defined, and the algorithms for model interpretation are described.

Citations: 1

Non blocking atomic commitment with an unreliable failure detector

Proceedings. 14th Symposium on Reliable Distributed Systems Pub Date : 1995-09-13 DOI: 10.1109/RELDIS.1995.518722

R. Guerraoui, M. Larrea, A. Schiper

Citations: 60

Experimental evaluation of the impact of processor faults on parallel applications

Proceedings. 14th Symposium on Reliable Distributed Systems Pub Date : 1995-09-13 DOI: 10.1109/RELDIS.1995.518719

D. Costa, F. Moreira, H. Madeira, M. Z. Rela, J. G. Silva

Abstract: This paper addresses the problem of processor faults in distributed memory parallel systems. It shows that transient faults injected at the processor pins of one node of a commercial parallel computer, without any particular fault-tolerant techniques, can cause erroneous application results for up to 43% of the injected faults (depending on the application). In addition to these very subtle faults, up to 19% of the injected faults (almost independent of the application) caused the system to hang up. These results show that fault-tolerant techniques are absolutely required in parallel systems, not only to ensure the completion of long-run applications but, more importantly, to achieve confidence in the application results. The benefits of including some fairly simple behaviour-based error detection mechanisms in the system were evaluated together with Algorithm Based Fault Tolerance (ABFT) techniques. The inclusion of such mechanisms in parallel systems seems to be very important for detecting most of those subtle errors without greatly affecting the performance and the cost of these systems.

Citations: 5

A synchronization strategy for a time-triggered multicluster real-time system

Proceedings. 14th Symposium on Reliable Distributed Systems Pub Date : 1995-09-13 DOI: 10.1109/RELDIS.1995.526223

H. Kopetz, A. Krüger, D. Millinger, A. Schedl

Citations: 24

Configurable highly available distributed services

Proceedings. 14th Symposium on Reliable Distributed Systems Pub Date : 1995-09-13 DOI: 10.1109/RELDIS.1995.526219

C. Karamanolis, J. Magee

Citations: 14

Designing masking fault-tolerance via nonmasking fault-tolerance

Proceedings. 14th Symposium on Reliable Distributed Systems Pub Date : 1995-09-13 DOI: 10.1109/RELDIS.1995.526225

A. Arora, S. Kulkarni

Citations: 70