2008 Symposium on Reliable Distributed Systems: Latest Publications

Gumshoe: Diagnosing Performance Problems in Replicated File-Systems
2008 Symposium on Reliable Distributed Systems Pub Date : 2008-10-06 DOI: 10.1109/SRDS.2008.35
Soila Kavulya, R. Gandhi, P. Narasimhan
Abstract: Replicated file-systems can experience degraded performance that might not be adequately handled by the underlying fault-tolerant protocols. We describe the design and implementation of Gumshoe, a system that aims to diagnose performance problems in replicated file-systems. Gumshoe periodically gathers OS and protocol metrics and then analyzes these metrics to automatically localize the performance problem to the culprit node(s). We describe our results and experiences with problem diagnosis in two replicated file-systems (replicated-CoreFS and BFS) using two file-system benchmarks (Postmark and IOzone).
Citations: 15
Fault-Tolerant Coverage Planning in Wireless Networks
2008 Symposium on Reliable Distributed Systems Pub Date : 2008-10-06 DOI: 10.1109/SRDS.2008.14
S. Ivanov, E. Nett
Abstract: Typically, wireless network coverage is planned with static redundancy to compensate for temporal variations in the environment. As a result, the service is still delivered, but the network coverage could have entered a critical state, meaning that further changes in the environment may lead to service failure. Service failures have to be explicitly notified by the applications. Therefore, in this paper we propose a methodology for fault-tolerant coverage planning. The idea is to detect the critical state and remove it through online system reconfiguration and restoration of the original static redundancy. Even in case of a failure, the system automatically generates a new configuration to restore the service, leading to shorter repair times. We describe how this approach can be applied to wireless mesh networks, often used in industrial applications like manufacturing, automation and logistics. The evaluation results show that the underlying model used for error detection and system recovery is accurate enough to correctly identify the system state.
Citations: 13
An Autonomic Approach for Replication of Internet-based Services
2008 Symposium on Reliable Distributed Systems Pub Date : 2008-10-06 DOI: 10.1109/SRDS.2008.22
Damián Serrano, M. Patiño-Martínez, R. Jiménez-Peris, Bettina Kemme
Abstract: As more and more applications are deployed as Internet-based services, they have to be available anytime, anywhere in a seamless manner. This requires the underlying infrastructure to provide scalability, fault tolerance and fast response times. While replicating the services and the data they access across sites that are located in different geographic regions is a promising means to achieve these requirements, data consistency is challenging if data continuously changes and queries are dynamic by nature, as is typical for e-commerce applications. Thus, current WAN replication solutions either trade performance for data consistency or are not able to scale in wide-area settings. In this paper, we present a novel approach to provide performance and consistency for Internet services. One of the main contributions is an autonomic replica placement module that places data copies only on servers close to clients that actually need them. The goal is to find the right trade-off between fast local access and the overhead of keeping data copies consistent. As data access patterns might change over time, reconfiguration is done periodically and online, i.e., allowing sites to receive new data copies or drop data copies while at the same time transaction processing continues in the system.
Citations: 23
Extending Paxos/LastVoting with an Adequate Communication Layer for Wireless Ad Hoc Networks
2008 Symposium on Reliable Distributed Systems Pub Date : 2008-10-06 DOI: 10.1109/SRDS.2008.21
Fatemeh Borran, R. Prakash, A. Schiper
Abstract: Most papers addressing consensus in wireless ad hoc networks adopt system models similar to those developed for wired networks. These models are focused towards node failures while ignoring link failures, and thus are poorly suited for wireless ad hoc networks. The recently proposed HO model does not have this drawback. The paper shows that an existing algorithm and the HO model can be used for multi-hop wireless ad hoc networks, if extended with an adequate communication layer. The description of the communication layer is augmented with simulation results that validate the feasibility of our approach and provide better understanding of the behavior of wireless environments.
Citations: 24
Towards Reliable Reputations for Dynamic Networked Systems
2008 Symposium on Reliable Distributed Systems Pub Date : 2008-10-06 DOI: 10.1109/SRDS.2008.31
Gayatri Swamynathan, Ben Y. Zhao, K. Almeroth, S. Rao Jammalamadaka
Abstract: A new generation of distributed systems and applications relies on the cooperation of diverse user populations motivated by self-interest. While they can utilize "reputation systems" to reduce selfish behaviors that disrupt or manipulate the network for personal gain, current reputations face a key challenge in large dynamic networks: vulnerability to peer collusion. In this paper, we propose to dramatically improve the accuracy of reputation systems with the use of a statistical metric that measures the "reliability" of a peer's reputation, taking into account collusion-like behavior. Trace-driven simulations on P2P network traffic show that our reliability metric drastically improves system performance. We also apply our metric to 18,000 randomly selected eBay user reputation profiles, and surprisingly discover numerous users with collusion-like behaviors worthy of additional investigation.
Citations: 4
Application-Level Recovery Mechanisms for Context-Aware Pervasive Computing
2008 Symposium on Reliable Distributed Systems Pub Date : 2008-10-06 DOI: 10.1109/SRDS.2008.13
D. Kulkarni, A. Tripathi
Abstract: We identify here various kinds of failure conditions and robustness issues that arise in context-aware pervasive computing applications. Such conditions are related to failures in an application's interactions with ambient services, failures in resource discovery and binding, and invalidation of context conditions during the execution of an application task. In this paper we present an exception handling model for integrating forward error recovery mechanisms in the designs of such applications. This model is integrated in a role-based framework and supported by a programming environment for construction of such applications.
Citations: 13
Formalizing System Behavior for Evaluating a System Hang Detector
2008 Symposium on Reliable Distributed Systems Pub Date : 2008-10-06 DOI: 10.1109/SRDS.2008.11
Long Wang, Z. Kalbarczyk, R. Iyer
Abstract: This paper presents an approach to formally verify the detection capability of a system hang detector. To achieve this goal, an abstract formal model of a typical Linux system is created to thoroughly exercise all execution scenarios that may lead to hangs. The goal is to expose cases (i.e., hang scenarios) that escape detection. Our system model abstracts the basic hardware (e.g., timer, hardware counter) and software (e.g., processes/threads) components present in the Linux system. The model enables: (i) capturing behavior of these components so as to depict execution scenarios that lead to hangs, and (ii) evaluating hang detection coverage. Explicit-state model checking is applied to reason about system behavior and uncover hang scenarios that escape detection. The results indicate that the proposed framework allows identification of corner cases of hang scenarios that escape detection and provides valuable insight to developers for enhancing detection mechanisms.
Citations: 8
Dynamically Quantifying and Improving the Reliability of Distributed Storage Systems
2008 Symposium on Reliable Distributed Systems Pub Date : 2008-10-06 DOI: 10.1109/SRDS.2008.36
Rekha Bachwani, Leszek Gryz, R. Bianchini, C. Dubnicki
Abstract: In this paper, we argue that the reliability of large-scale storage systems can be significantly improved by using better reliability metrics and more efficient policies for recovering from hardware failures. Specifically, we make three main contributions. First, we introduce NDS (Normalcy Deviation Score), a new metric for dynamically quantifying the reliability status of a storage system. Second, we propose MinI (Minimum Intersection), a novel recovery scheduling policy that improves reliability by efficiently reconstructing data after a hardware failure. MinI uses NDS to trade off reliability and performance in making its scheduling decisions. Third, we evaluate NDS and MinI for three common data-allocation schemes and a number of different parameters. Our evaluation focuses on a distributed storage system based on erasure codes. We find that MinI improves reliability significantly, as compared to conventional policies.
Citations: 9
An Absolute-Relative Risk Assessment Methodology Approach to Current Safety Critical Systems and its Application to the ADS-B based Air Traffic Control System
2008 Symposium on Reliable Distributed Systems Pub Date : 2008-10-06 DOI: 10.1109/SRDS.2008.24
L. Vismari, J. Camargo
Abstract: This work presents a risk assessment methodology, preliminarily proposed in [1], which is the fusion of the "absolute" and the "relative" risk assessment methods adopted by the International Civil Aviation Organization. The proposed methodology uses the Fluid Stochastic Petri Net (FSPN) as its modeling formalism, and compares the safety metrics estimated from the simulation of both the proposed and the legacy system models. It was applied to assess the safety properties of a new air traffic surveillance concept, named "automatic dependent surveillance - broadcasting" (ADS-B). We conclude that the proposed methodology is able to assess the safety properties of systems based on the current safety critical system paradigm, especially concerning the air transportation system. In addition, the FSPN formalism provided important modeling capabilities and discrete event simulation, allowing the desired safety metrics to be estimated. Finally, ADS-B (the proposed system) significantly reduced the risk of separation losses between aircraft compared to the usual surveillance radar systems (the legacy system) in an air traffic control (ATC) environment.
Citations: 9
A Probabilistic Analysis of Snapshot Isolation with Partial Replication
2008 Symposium on Reliable Distributed Systems Pub Date : 2008-10-06 DOI: 10.1109/SRDS.2008.10
J. M. Bernabé-Gisbert, Vaide Zuikeviciute, F. D. Muñoz-Escoí, F. Pedone
Abstract: Snapshot isolation has received a considerable amount of attention in the context of full database replication. Such popularity is mainly because read-only transactions executing under snapshot isolation are never blocked or aborted. In partial replication, where each replica holds only a part of the database, transactions may require access to remote databases. Each remote read operation of the transaction must execute in a consistent global database snapshot as the local operations; if such a snapshot is not available, the transaction must be aborted. In this paper we are interested in the effects of distributed transactions on the abort rate of partially replicated snapshot isolation systems. We present a simple probabilistic analysis of transaction abort rates for two different concurrency control mechanisms: lock- and version-based. The former models the behavior of a replication protocol providing one-copy-serializability; the latter models snapshot isolation. Our analysis reveals that in the version-based system the execution abort rate decreases exponentially as the number of data versions available increases. As a consequence, in all cases considered, two versions of each data item were sufficient to eliminate aborts due to distributed transactions.
Citations: 10