Proceedings of the 23rd IEEE International Symposium on Reliable Distributed Systems, 2004: Latest Papers, Page 2

On the progress in fault-tolerant real-time computing

Proceedings of the 23rd IEEE International Symposium on Reliable Distributed Systems, 2004. Pub Date : 2004-10-18 DOI: 10.1109/RELDIS.2004.1353008

P. Ezhilchelvan

Citations: 1

Token-based atomic broadcast using unreliable failure detectors

Proceedings of the 23rd IEEE International Symposium on Reliable Distributed Systems, 2004. Pub Date : 2004-10-18 DOI: 10.1109/RELDIS.2004.1353003

Richard Ekwall, A. Schiper, P. Urbán

Citations: 38

Hardware support for high performance, intrusion- and fault-tolerant systems

Proceedings of the 23rd IEEE International Symposium on Reliable Distributed Systems, 2004. Pub Date : 2004-10-18 DOI: 10.1109/RELDIS.2004.1353020

G. P. Saggese, C. Basile, L. Romano, Z. Kalbarczyk, R. Iyer

Abstract: The paper proposes a combined hardware/software approach for realizing high performance, intrusion- and fault-tolerant services. The approach is demonstrated for (yet not limited to) an attribute authority server, which provides a compelling application due to its stringent performance and security requirements. The key element of the proposed architecture is an FPGA-based, parallel crypto-engine providing (1) optimally dimensioned RSA Processors for efficient execution of computationally intensive RSA signatures and (2) a KeyStore facility used as tamper-resistant storage for preserving secret keys. To achieve linear speed-up (with the number of RSA Processors) and deadlock-free execution in spite of resource-sharing and scheduling/synchronization issues, we have resorted to a number of performance enhancing techniques (e.g., use of different clock domains, optimal balance between internal and external parallelism) and have formally modeled and mechanically proved our crypto-engine with the Spin model checker. At the software level, the architecture combines active replication and threshold cryptography, but in contrast with previous work, the code of our replicas is multithreaded so it can efficiently use an attached parallel crypto-engine to compute an attribute authority partial signature (as required by threshold cryptography). The resulting replicated systems exhibit nondeterministic behavior, which cannot be handled with conventional replication approaches. Our architecture is based on a preemptive deterministic scheduling algorithm to govern scheduling of replica threads and guarantee strong replica consistency.

Citations: 7

Run-time monitoring for dependable systems: an approach and a case study

Proceedings of the 23rd IEEE International Symposium on Reliable Distributed Systems, 2004. Pub Date : 2004-10-18 DOI: 10.1109/RELDIS.2004.1353002

Sérgio Ricardo Rota, J. R. Almeida

Citations: 14

Dependable pervasive systems

Proceedings of the 23rd IEEE International Symposium on Reliable Distributed Systems, 2004. Pub Date : 2004-10-18 DOI: 10.1109/RELDIS.2004.1352998

B. Randell

Abstract: Summary form only given. Present trends indicate that huge networked computer systems are likely to become pervasive, as information technology is embedded into virtually everything, and to be required to function essentially continuously. I believe that even today's (underused) "best practice" regarding the achievement of high dependability - reliability, availability, security, safety, etc. - from large networked computer systems will not suffice for future pervasive systems. I will give my perspective on the current state of research into the four basic dependability technologies: (i) fault prevention (to avoid the occurrence or introduction of faults), (ii) fault removal (through validation and verification), (iii) fault tolerance (so that failures do not necessarily occur even if faults remain), and (iv) fault forecasting (the means of assessing progress towards achieving adequate dependability). I will then argue that much further research is required on all four dependability technologies in order to cope with pervasive systems, identify some priorities, and discuss how this research could best be aimed at making system dependability into a "commodity" that industry can value and from which it can profit.

Citations: 22

A stability-oriented approach to improving BGP convergence

Proceedings of the 23rd IEEE International Symposium on Reliable Distributed Systems, 2004. Pub Date : 2004-10-18 DOI: 10.1109/RELDIS.2004.1353006

Hongwei Zhang, A. Arora, Zhijun Liu

Abstract: This paper shows that the elimination of fault-agnostic instability, the instability caused by fault-agnostic distributed control, substantially improves BGP convergence speed. To this end, we first classify BGP convergence instability into two categories: fault-agnostic instability and distribution-inherent instability; secondly, we prove the impossibility of eliminating all distribution-inherent instability in distributed routing protocols; thirdly, we design the grapevine border gateway protocol (G-BGP) to show that all fault-agnostic instability can be eliminated. G-BGP eliminates all fault-agnostic instability under different fault and routing policy scenarios by (i) piggybacking onto BGP UPDATE messages fine-grained information about faults to the nodes affected by the faults, (ii) quickly resolving the uncertainty between link and node failure as well as the uncertainty of whether a node has changed route, and (iii) rejecting obsolete fault information. We have evaluated G-BGP by both analysis and simulation. Analytically, we prove that, by eliminating fault-agnostic instability, G-BGP achieves optimal convergence speed in several scenarios where BGP convergence is severely delayed (e.g., when a node or a link fail-stops), and when the shortest-path-first policy is used, G-BGP asymptotically improves BGP convergence speed except in scenarios where BGP convergence speed is already optimal (e.g., when a node or a link joins). By simulating networks with up to 115 autonomous systems, we observe that G-BGP improves BGP convergence stability and speed by an order of magnitude.

Citations: 17

Balancing the tradeoffs between data accessibility and query delay in ad hoc networks

Proceedings of the 23rd IEEE International Symposium on Reliable Distributed Systems, 2004. Pub Date : 2004-10-18 DOI: 10.1109/RELDIS.2004.1353029

Liangzhong Yin, G. Cao

Citations: 46

A signal processing approach to global predicate monitoring

Proceedings of the 23rd IEEE International Symposium on Reliable Distributed Systems, 2004. Pub Date : 2004-10-18 DOI: 10.1109/RELDIS.2004.1353014

N. Ghafari, R. Seviora

Abstract: Global predicate evaluation is a fundamental problem in distributed systems. This paper views it from a different perspective, namely that of the signals and systems area of electrical engineering. It adapts a signal processing approach to address this problem in the context of monitoring of 'health' of a software system. The global state of the system is viewed as a 'state' signal which evolves over time. The distributed processes are assumed to possess roughly synchronized clocks. The states of individual processes are periodically sampled and reported to a global monitor. The observed system state constructed by the global monitor is viewed as being composed of two components - the consistent global states and an error signal due to the messages in transit and differences in the local clocks. The global monitor removes the error signal by processing the observed global signal through a low-pass filter. It evaluates the predicates on the filtered signal. The approach presented is applicable to distributed systems which are semi-stationary, i.e. whose internal states of interest remain stable over comparatively long intervals of time. The paper presents the relevant signal processing concepts (p-spectrum and p-filtering), outlines an architecture for global predicate monitoring and describes the signal processing done in the global monitor. The paper then summarizes an evaluation of the approach presented on a small computer aided vehicle dispatch system. The evaluation experiments are described and the results are presented and analyzed.

Citations: 0
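
The monitoring loop described in the abstract of "A signal processing approach to global predicate monitoring" (sample local states, low-pass filter the observed global signal, evaluate the predicate on the filtered signal) can be illustrated with a rough sketch. The abstract does not define the paper's p-spectrum or p-filtering, so this sketch substitutes a plain moving average as the low-pass filter; the class names, the window size, and the `both_busy` predicate are all hypothetical and are not taken from the paper.

```python
from collections import deque
from typing import Callable, Deque, List, Sequence


class MovingAverage:
    """Stand-in low-pass filter: mean of the last `window` samples.

    Assumption: the paper's p-filtering is NOT reproduced here.
    """

    def __init__(self, window: int) -> None:
        if window < 1:
            raise ValueError("window must be >= 1")
        self._samples: Deque[float] = deque(maxlen=window)

    def update(self, sample: float) -> float:
        self._samples.append(sample)
        return sum(self._samples) / len(self._samples)


class GlobalMonitor:
    """Hypothetical monitor: one filter per process, predicate on filtered state."""

    def __init__(
        self,
        n_processes: int,
        window: int,
        predicate: Callable[[Sequence[float]], bool],
    ) -> None:
        self._filters = [MovingAverage(window) for _ in range(n_processes)]
        self._predicate = predicate

    def observe(self, reported: Sequence[float]) -> bool:
        """Feed one round of periodically sampled local states."""
        if len(reported) != len(self._filters):
            raise ValueError("one sample per process is required")
        smoothed = [f.update(x) for f, x in zip(self._filters, reported)]
        return self._predicate(smoothed)


def both_busy(states: Sequence[float]) -> bool:
    # Illustrative global predicate: every process is (mostly) in state 1.
    return all(s > 0.5 for s in states)


def run_demo() -> List[bool]:
    monitor = GlobalMonitor(n_processes=2, window=3, predicate=both_busy)
    # The third round is a one-sample glitch, standing in for the error
    # signal caused by messages in transit or clock differences.
    observed = [(0, 0), (0, 0), (1, 1), (0, 0), (1, 1), (1, 1), (1, 1)]
    return [monitor.observe(round_) for round_ in observed]


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the isolated glitch in the third round does not trigger the predicate, while the sustained change from the fifth round onward does. This relies on the semi-stationarity assumption stated in the abstract: the states of interest must stay stable for longer than the filter window.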

Proactive hot spot avoidance for Web server dependability

Proceedings of the 23rd IEEE International Symposium on Reliable Distributed Systems, 2004. Pub Date : 2004-10-18 DOI: 10.1109/RELDIS.2004.1353031

P. Felber, T. Kaldewey, S. Weiss

Abstract: Flash crowds, which result from the sudden increase in popularity of some online content, are among the most important problems that plague today's Internet. Affected servers are overloaded with requests and quickly become "hot spots." They usually suffer from severe performance failures or stop providing service altogether, as there are scarcely any effective techniques to scalably deliver content under hot spot conditions to all requesting clients. In this paper, we propose and evaluate collaborative techniques to detect and proactively avoid the occurrence of hot spots. Using our mechanisms, groups of small- to medium-sized Web servers can team up to withstand unexpected surges of requests in a cost-effective manner. Once a Web server detects a sudden increase in request traffic, it replicates on-the-fly the affected content on other Web servers; subsequent requests are transparently redirected to the copies to offload the primary server. Each server acts both as a primary source for its own content, and as a secondary source for other servers' content in the event of a flash-crowd; scalability and dependability are therefore achieved in a peer-to-peer fashion, with each peer contributing to, and benefiting from, the service. Our proactive hot spot avoidance techniques are implemented as a module for the popular Apache Web server. We have conducted a comprehensive experimental evaluation, which demonstrates that our techniques are effective at dealing with flash crowds and scaling to very high request loads.

Citations: 20
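
The detect, replicate, and redirect cycle described in the abstract of "Proactive hot spot avoidance for Web server dependability" can be sketched roughly as follows. This is not the authors' Apache module: the sliding-window threshold, the round-robin choice of peer, and every name below (`HotSpotGuard`, `route`, the peer labels) are assumptions made for illustration, and the actual copying of content to peers is left as a stub.

```python
from collections import deque
from itertools import cycle
from typing import Deque, Dict, Iterator, List


class HotSpotGuard:
    """Hypothetical per-server detector and redirector for hot content."""

    def __init__(self, peers: List[str], max_requests: int, window_s: float) -> None:
        if not peers:
            raise ValueError("at least one peer server is required")
        self._peers = peers
        self._max_requests = max_requests
        self._window_s = window_s
        self._hits: Dict[str, Deque[float]] = {}
        self._replicas: Dict[str, Iterator[str]] = {}

    def route(self, path: str, now: float) -> str:
        """Return "local" or the name of the peer that should serve `path`."""
        if path in self._replicas:
            # Content was already replicated: offload to the copies in turn.
            return next(self._replicas[path])

        hits = self._hits.setdefault(path, deque())
        hits.append(now)
        # Keep only the requests that fall inside the sliding window.
        while hits and now - hits[0] > self._window_s:
            hits.popleft()

        if len(hits) > self._max_requests:
            # Sudden surge detected. A real system would copy the content
            # to the peers here; this sketch only records the decision.
            self._replicas[path] = cycle(self._peers)
        return "local"


def demo() -> List[str]:
    guard = HotSpotGuard(["peer-a", "peer-b"], max_requests=3, window_s=1.0)
    arrival_times = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
    return [guard.route("/news", t) for t in arrival_times]


if __name__ == "__main__":
    print(demo())
```

With these illustrative settings, the fourth request inside one second trips the detector and is still served locally; the requests that follow are redirected to the peers in turn. Traffic that arrives slowly enough never exceeds the window count and stays local.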

Design and evaluation of a QoS-adaptive system for reliable multicasting

Proceedings of the 23rd IEEE International Symposium on Reliable Distributed Systems, 2004. Pub Date : 2004-10-18 DOI: 10.1109/RELDIS.2004.1353001

Antonio Di Ferdinando, P. Ezhilchelvan, I. Mitrani

Citations: 4