Proceedings Seventeenth IEEE Symposium on Reliable Distributed Systems (Cat. No.98CB36281): Latest Articles, Page 6

An integration of the primary-shadow TMO replication scheme with a supervisor-based network surveillance scheme and its recovery time bound analysis

Proceedings Seventeenth IEEE Symposium on Reliable Distributed Systems (Cat. No.98CB36281) Pub Date : 1998-10-20 DOI: 10.1109/RELDIS.1998.740490

K. Kim, C. Subbaraman

Abstract: The time-triggered message-triggered object (TMO) scheme was formulated a few years ago (K.H. Kim et al., 1994; K.H. Kim and C. Subbaraman, 1997) as a major extension of the conventional object structuring schemes, with the idealistic goal of facilitating general-form design and timeliness-guaranteed design of complex real-time application systems. Recently, as a new scheme for realizing TMO-structured distributed and parallel computer systems capable of both hardware and software fault tolerance, we have formulated and demonstrated the primary-shadow TMO replication (PSTR) scheme. An important new extension of the PSTR scheme is its integration with a network surveillance (NS) scheme. This extension results in a significant improvement in the fault coverage and recovery time bound achieved. The NS scheme adopted is a recently developed scheme that is effective in a wide range of point-to-point networks; it is called the supervisor-based NS (SNS) scheme. The integration of the PSTR scheme and the SNS scheme is called the PSTR/SNS scheme. The recovery time bound of the PSTR/SNS scheme is analyzed on the basis of an implementation model that can be easily adapted to various commercial operating system kernels.

Citations: 5

Secure and scalable replication in Phalanx

Proceedings Seventeenth IEEE Symposium on Reliable Distributed Systems (Cat. No.98CB36281) Pub Date : 1998-10-20 DOI: 10.1109/RELDIS.1998.740474

D. Malkhi, M. Reiter

Citations: 163

Security in the large: is Java's sandbox scalable?

Proceedings Seventeenth IEEE Symposium on Reliable Distributed Systems (Cat. No.98CB36281) Pub Date : 1998-10-20 DOI: 10.1109/RELDIS.1998.740528

Qun Zhong, Nigel Edwards

Citations: 4

A fragmentation scheme for multimedia traffic in active networks

Proceedings Seventeenth IEEE Symposium on Reliable Distributed Systems (Cat. No.98CB36281) Pub Date : 1998-10-20 DOI: 10.1109/RELDIS.1998.740537

S. Wang, B. Bhargava

Citations: 7

End to end reliable multicast transport protocol requirements for collaborative multimedia systems

Proceedings Seventeenth IEEE Symposium on Reliable Distributed Systems (Cat. No.98CB36281) Pub Date : 1998-10-20 DOI: 10.1109/RELDIS.1998.740535

Nadia Kausar, J. Crowcroft

Abstract: Multi-party collaborative multimedia applications require data to be transmitted reliably and efficiently in order to provide a guaranteed quality of service (QoS). These applications range from distributed games and shared whiteboards to interactive video conferencing. They often involve a large number of participants and are interactive in nature, with participants dynamically joining and leaving. To provide many-to-many interaction when the number of participants is large, IP multicasting is a very good option for communication. IP multicasting provides scalability and efficient routing but does not provide the reliability that these multimedia applications may require. Although much research has been done on reliable multicast transport protocols, it appears that the only way to achieve reliable multicast is to build it for a given purpose, such as conference control in multimedia conferencing. This paper compares some of the available multicast transport protocols and analyses the most suitable features and functionalities provided by these protocols for one facet of conference control: floor control. The goal is to find or design a reliable multicast transport protocol that would scale to tens or hundreds of participants scattered across the Internet and that would deliver the control messages reliably.

Citations: 2

Failure handling in an optimized two-safe approach to maintaining primary-backup systems

Proceedings Seventeenth IEEE Symposium on Reliable Distributed Systems (Cat. No.98CB36281) Pub Date : 1998-10-20 DOI: 10.1109/RELDIS.1998.740488

Kexiang Hu, S. Mehrotra, S. Kaplan

Abstract: In a primary-backup database system, transaction processing takes place at the primary, and the log records generated are propagated to the backup, which uses them to reconstruct the database state at the primary. If the primary fails, the backup takes over to provide continued service. Most existing designs of primary-backup database systems have concentrated on techniques to tolerate complete failures, in which the entire primary fails, say due to a disaster. In multiprocessor environments, where the primary and the backup databases are partitioned across multiple computers, a more common case is a partial failure, in which some database partitions fail but the system as a whole survives. Existing approaches either ignore partial failures or require the failed database partition to be unavailable. We explore a design of the primary-backup database system that uses the backup not only for disaster protection but also for continued availability during partial failures. The approach is developed in the context of the improved optimized 2-safe strategy for transmitting logs from the primary to the backup, introduced by K. Hu et al. (1997), which combines the best features of the previously developed 1-safe and 2-safe strategies.
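As a rough illustration of the 1-safe and 2-safe strategies mentioned in the abstract above, the following Python sketch contrasts the two commit rules using their commonly cited definitions: under 1-safe the primary commits without waiting for the backup, while under 2-safe the commit completes only once the backup has received the log record. The names `Backup`, `Primary`, `commit`, and `lost_on_primary_failure` are hypothetical, the "network" is an in-memory queue, and the sketch does not attempt to reproduce the optimized 2-safe strategy of Hu et al.

```python
from collections import deque


class Backup:
    """Hypothetical backup site: applies log records to rebuild primary state."""

    def __init__(self):
        self.state = {}

    def apply(self, record):
        key, value = record
        self.state[key] = value
        return True  # acknowledgement sent back to the primary


class Primary:
    """Hypothetical primary site with a selectable log-shipping mode."""

    def __init__(self, backup, mode):
        assert mode in ("1-safe", "2-safe")
        self.backup = backup
        self.mode = mode
        self.state = {}
        self.in_flight = deque()  # log records not yet delivered to the backup

    def commit(self, key, value):
        record = (key, value)
        self.state[key] = value
        if self.mode == "2-safe":
            # The commit completes only after the backup acknowledges the record.
            acked = self.backup.apply(record)
            assert acked
        else:
            # 1-safe: the commit returns at once; the record ships later.
            self.in_flight.append(record)

    def ship_pending(self):
        """Deliver all queued log records to the backup (1-safe mode)."""
        while self.in_flight:
            self.backup.apply(self.in_flight.popleft())


def lost_on_primary_failure(mode):
    """Commit two transactions, then 'crash' the primary before shipping."""
    backup = Backup()
    primary = Primary(backup, mode)
    primary.commit("x", 1)
    primary.commit("y", 2)
    # The primary fails here; records still in flight never reach the backup.
    return sorted(set(primary.state) - set(backup.state))
```

In this toy model, `lost_on_primary_failure("1-safe")` reports both committed keys as missing at the backup, whereas the 2-safe run loses none, at the cost of a backup round trip inside every commit.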

Citations: 5

Off-line diagnosis of parallel systems

Proceedings Seventeenth IEEE Symposium on Reliable Distributed Systems (Cat. No.98CB36281) Pub Date : 1998-10-20 DOI: 10.1109/RELDIS.1998.740524

O. Benkahla, C. Robach

Citations: 0

Semi-passive replication

Proceedings Seventeenth IEEE Symposium on Reliable Distributed Systems (Cat. No.98CB36281) Pub Date : 1998-10-20 DOI: 10.1109/RELDIS.1998.740473

X. Défago, A. Schiper, N. Sergent

Abstract: This paper presents the semi-passive replication technique, a variant of passive replication that can be implemented in the asynchronous system model without requiring a membership service to agree on a primary. Passive replication is a popular replication technique since it can tolerate non-deterministic servers (e.g., multi-threaded servers) and uses little processing power when compared to other replication techniques. However, passive replication suffers from a high reconfiguration cost when the primary fails. The semi-passive replication technique presented in the paper retains the advantages of passive replication. However, since it does not require a group membership service, semi-passive replication has a considerably lower cost in case of failure. As explained in the paper, this technique can use an aggressive time-out value significantly lower than what a group membership service allows. As a result, the reaction to crashes is greatly improved. The semi-passive replication algorithm uses failure detectors. The algorithm given in the paper is analysed in the failure-free case and in the case of one server crash. The response time (for the client) in these two scenarios is analysed through simulation.

Citations: 134

Load balancing of dynamic and adaptive mesh-based computations

Proceedings Seventeenth IEEE Symposium on Reliable Distributed Systems (Cat. No.98CB36281) Pub Date : 1998-10-20 DOI: 10.1109/RELDIS.1998.740514

K. Schloegel, G. Karypis, Vipin Kumar

Abstract: One ingredient viewed as vital to the successful conduct of many large-scale numerical simulations is the ability to dynamically repartition the underlying adaptive finite element mesh among the processors so that the computations are balanced and interprocessor communication is minimized. We present two new schemes for adaptive repartitioning: Locally-Matched Multilevel Scratch-Remap (LMSR) and Wavefront Diffusion. The LMSR scheme performs purely local coarsening and partition remapping in a multilevel context. In Wavefront Diffusion, the flow of vertices moves in a wavefront from overbalanced to underbalanced domains. We present experimental evaluations of our LMSR and Wavefront Diffusion algorithms on synthetically generated adaptive meshes as well as on some application meshes. We show that our LMSR algorithm decreases the amount of vertex migration required to balance the graph and produces repartitionings of similar quality compared to current scratch-remap schemes. Furthermore, we show that our LMSR algorithm is more scalable in terms of execution time than current scratch-remap schemes. We show that our Wavefront Diffusion algorithm obtains significantly lower vertex migration requirements while maintaining similar edge-cut results compared to current multilevel diffusion algorithms, especially for highly imbalanced graphs.
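The three quantities the abstract above weighs against each other (load balance, edge-cut, and vertex migration) can be made concrete with a small Python sketch. It only measures a given partitioning; it is not an implementation of LMSR or Wavefront Diffusion. The function names and the toy path graph are assumptions chosen for illustration, and every vertex is assumed to carry unit weight.

```python
from collections import Counter


def edge_cut(edges, part):
    """Number of edges whose endpoints lie in different partitions."""
    return sum(1 for u, v in edges if part[u] != part[v])


def imbalance(part, num_parts):
    """Largest partition size divided by the ideal (average) size."""
    sizes = Counter(part.values())
    ideal = len(part) / num_parts
    return max(sizes[p] for p in range(num_parts)) / ideal


def migration(old_part, new_part):
    """Number of vertices that change partition in a repartitioning."""
    return sum(1 for v in old_part if old_part[v] != new_part[v])


# A 6-vertex path graph 0-1-2-3-4-5 split across two processors.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
old = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1}  # unbalanced: 4 vertices vs 2
new = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}  # balanced: 3 vertices vs 3
```

Here moving vertex 3 to the second processor restores balance while keeping the edge-cut at one edge and migrating a single vertex, which is the kind of trade-off a repartitioning scheme tries to achieve on much larger meshes.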

Citations: 0

A multiprocessor scheduling algorithm for low overhead fault-tolerance

Proceedings Seventeenth IEEE Symposium on Reliable Distributed Systems (Cat. No.98CB36281) Pub Date : 1998-10-20 DOI: 10.1109/RELDIS.1998.740493

Koji Hashimoto, Tatsuhiro Tsuchiya, T. Kikuno

Citations: 5