2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS): Latest Articles

Being Accurate Is Not Enough: New Metrics for Disk Failure Prediction

2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS) Pub Date : 2016-09-01 DOI: 10.1109/SRDS.2016.019

Jing Li, Rebecca J. Stones, G. Wang, Zhongwei Li, X. Liu, Kang Xiao

Abstract: Traditionally, disk failure prediction accuracy is used to evaluate disk failure prediction models. However, accuracy may not reflect their practical usage (protecting against failures, rather than only predicting failures) in cloud storage systems. In this paper, we propose two new metrics for disk failure prediction models: migration rate, which measures how much at-risk data is protected as a result of correct failure predictions, and mismigration rate, which measures how much data is migrated needlessly as a result of false failure predictions. To demonstrate their effectiveness, we compare disk failure prediction methods: (a) a classification tree (CT) model vs. a state-of-the-art recurrent neural network (RNN) model, and (b) a proposed residual life prediction model based on gradient boosted regression trees (GBRTs) vs. RNN. While prediction accuracy experiments favor the RNN model, migration rate experiments can favor the CT and GBRT models (depending on transfer rates). We conclude that prediction accuracy can be a misleading metric. Moreover, the proposed GBRT model offers a practical improvement in disk failure prediction in real-world data centers.

Citations: 36
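
The two metrics named in the abstract above can be illustrated with a small sketch. The abstract does not give formulas, so everything here is an assumption made for illustration: the `Disk` fields, the linear transfer model (data saved = lead time × transfer rate, capped at the disk's contents), and the choice to count a falsely flagged disk as fully migrated. It is not the authors' definition.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    data_gb: float            # data stored on the disk
    will_fail: bool           # ground truth: does the disk actually fail?
    predicted_fail: bool      # model output
    lead_time_h: float = 0.0  # hours between alarm and failure (hypothetical field)


def migrated_gb(disk, transfer_gb_per_h):
    """Data copied off the disk before it fails (assumed linear transfer)."""
    return min(disk.data_gb, disk.lead_time_h * transfer_gb_per_h)


def migration_rate(disks, transfer_gb_per_h):
    """Share of data on failing disks that is moved to safety in time."""
    at_risk = sum(d.data_gb for d in disks if d.will_fail)
    saved = sum(migrated_gb(d, transfer_gb_per_h)
                for d in disks if d.will_fail and d.predicted_fail)
    return saved / at_risk if at_risk else 0.0


def mismigration_rate(disks):
    """Share of data on healthy disks that is moved because of a false alarm."""
    healthy = sum(d.data_gb for d in disks if not d.will_fail)
    wasted = sum(d.data_gb for d in disks
                 if not d.will_fail and d.predicted_fail)
    return wasted / healthy if healthy else 0.0


# Model A flags both failing disks, but only 2 hours ahead.
model_a = [Disk(1000, True, True, 2), Disk(1000, True, True, 2),
           Disk(1000, False, False)]
# Model B misses one failure but flags the other 20 hours ahead.
model_b = [Disk(1000, True, True, 20), Disk(1000, True, False),
           Disk(1000, False, False)]

for name, disks in (("A", model_a), ("B", model_b)):
    print(name, migration_rate(disks, transfer_gb_per_h=100),
          mismigration_rate(disks))
```

With these made-up numbers, model A detects more failures yet protects less data than model B, which is the kind of disagreement between accuracy and migration rate that the abstract describes.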

Adaptive Location Privacy with ALP

2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS) Pub Date : 2016-09-01 DOI: 10.1109/SRDS.2016.044

Vincent Primault, A. Boutet, Sonia Ben Mokhtar, L. Brunie

Abstract: With the increasing amount of mobility data being collected on a daily basis by location-based services (LBSs) comes a new range of threats for users, related to the over-sharing of their location information. To deal with this issue, several location privacy protection mechanisms (LPPMs) have been proposed in recent years. However, each of these mechanisms comes with different configuration parameters that have a direct impact both on the privacy guarantees offered to the users and on the resulting utility of the protected data. In this context, it can be difficult for non-expert system designers to choose the appropriate configuration to use. Moreover, these mechanisms are generally configured once and for all, which results in the same configuration for every protected piece of information. However, not all users have the same behaviour, and even the behaviour of a single user is likely to change over time. To address this issue, we present in this paper ALP (which stands for Adaptive Location Privacy), a new framework enabling the dynamic configuration of LPPMs. ALP can be used in two scenarios: (1) offline, where ALP enables a system designer to choose and automatically tune the most appropriate LPPM for the protection of a given dataset, and (2) online, where ALP enables the user of a crowd sensing application to protect consecutive batches of her geolocated data by automatically tuning a given LPPM to fulfil a set of privacy and utility objectives. We evaluate ALP on both scenarios with two real-life mobility datasets and two state-of-the-art LPPMs. Our experiments show that the adaptive LPPM configurations found by ALP outperform static configurations in terms of trade-off between privacy and utility.

Citations: 20

Achieving High Reliability via Expediting the Repair of Critical Blocks in Replicated Storage Systems

2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS) Pub Date : 2016-09-01 DOI: 10.1109/SRDS.2016.018

Juntao Fang, Shenggang Wan, Ping-Hsiu Huang, Xubin He, C. Xie

Abstract: High reliability is critical to large data centers consisting of hundreds to thousands of storage nodes where node failures are not rare. Data replication is a typical technique deployed to achieve high reliability. When a node failure is detected, blocks with lost replicas are identified and recovered. Long timeouts are usually used for node failure detection. For blocks with one lost replica, the long timeouts can significantly reduce network traffic induced by data recovery. However, for blocks with two or more lost replicas, which can be caused by concurrent node failures that are not rare in large data centers, the long timeouts will result in a high risk of loss of these blocks. In this paper, we propose MFR to separate the identification of the blocks with two or more lost replicas from that of the blocks with one lost replica, so that the identification of the blocks with two or more lost replicas can be accelerated while that of the blocks with one lost replica stays the same. Consequently, MFR can significantly improve data reliability while keeping the network traffic induced by data recovery stable. The results from our simulation and prototype implementation show that MFR improves the reliability of storage systems by a factor of up to 4.0 in terms of mean time to data loss. As blocks with two or more lost replicas are far fewer than blocks with one lost replica, the extra network traffic caused by MFR is less than 0.54% of total network traffic for data recovery.

Citations: 0
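
The idea in the abstract above, identifying blocks with two or more lost replicas sooner than blocks with one lost replica, can be sketched as follows. The abstract does not say how MFR achieves the separation; using a second, shorter heartbeat timeout is an assumption made here for illustration, and the function name, parameters, and timeout values are all hypothetical.

```python
def blocks_to_repair(now, last_heartbeat, block_replicas,
                     long_timeout=600.0, short_timeout=30.0):
    """Split blocks into (urgent, normal) repair lists.

    Assumed policy, not taken from the paper: a node is 'suspected'
    after short_timeout seconds of silence and 'dead' after
    long_timeout. Blocks with two or more suspected replicas are
    repaired at once; blocks with a single lost replica wait for the
    usual long timeout.
    """
    silent_for = {node: now - t for node, t in last_heartbeat.items()}
    suspected = {n for n, s in silent_for.items() if s > short_timeout}
    dead = {n for n, s in silent_for.items() if s > long_timeout}

    urgent, normal = [], []
    for block, nodes in block_replicas.items():
        n_suspected = sum(1 for n in nodes if n in suspected)
        n_dead = sum(1 for n in nodes if n in dead)
        if n_suspected >= 2:
            urgent.append(block)   # critical: act on the short timeout
        elif n_dead == 1:
            normal.append(block)   # one lost replica: long timeout applies
    return urgent, normal


heartbeats = {"n1": 990, "n2": 900, "n3": 950, "n4": 300, "n5": 995}
replicas = {
    "b1": ["n1", "n2", "n3"],  # two replicas on silent nodes
    "b2": ["n1", "n2", "n5"],  # one replica on a briefly silent node
    "b3": ["n1", "n4", "n5"],  # one replica on a long-dead node
}
urgent, normal = blocks_to_repair(1000, heartbeats, replicas)
print("urgent:", urgent, "normal:", normal)
```

Block `b2` appears in neither list: its single missing replica has not yet exceeded the long timeout, so no recovery traffic is generated for it.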

ProCode: A Proactive Erasure Coding Scheme for Cloud Storage Systems

2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS) Pub Date : 2016-09-01 DOI: 10.1109/SRDS.2016.039

Peng Li, Jing Li, Rebecca J. Stones, G. Wang, Zhongwei Li, X. Liu

Abstract: Common distributed storage systems use data replication to improve system reliability and maintain data availability, but at the cost of disk storage. In order to lower storage costs, data may instead be stored according to erasure codes, but this results in greater network and disk traffic when data blocks are reconstructed following an erasure. These methods are also passive, i.e., they only reconstruct data after failures occur. In this paper, we present a proactive erasure coding scheme (ProCode). We monitor the health of disks via drive failure prediction and automatically adjust the replication factor of data blocks on at-risk disks to ensure data safety. In this way, we achieve fast recovery after disk failures without significantly increasing the storage overhead. ProCode is implemented as an extension to HDFS-RAID used by Facebook. Compared with replication storage and erasure coding, ProCode improves system reliability and availability. Specifically, experimental results show a reduction of two or more orders of magnitude in the average number of data loss events over a 10-year period, a 63% or greater drop in degraded read latency, and a 78% drop in recovery time.

Citations: 22
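
The proactive step described in the abstract above, raising the replication factor of blocks that sit on at-risk disks, might be planned along these lines. This is a minimal sketch under stated assumptions: the function name, the `base` and `boosted` replica counts, and the idea of dropping extra copies once a disk is no longer flagged are hypothetical and do not come from the paper or from the HDFS-RAID API.

```python
def plan_replication(block_disk, current_replicas, at_risk_disks,
                     base=1, boosted=2):
    """Return {block: delta}, the replicas to add (+) or remove (-).

    block_disk maps each block to the disk holding its primary copy;
    at_risk_disks is the set of disks the failure predictor flagged.
    """
    actions = {}
    for block, disk in block_disk.items():
        want = boosted if disk in at_risk_disks else base
        have = current_replicas.get(block, base)
        if want != have:
            actions[block] = want - have
    return actions


block_disk = {"b1": "d1", "b2": "d2", "b3": "d3"}
current = {"b1": 1, "b2": 2, "b3": 1}   # b2 was boosted earlier
plan = plan_replication(block_disk, current, at_risk_disks={"d1"})
print(plan)
```

Here `b1` gains a copy because its disk is flagged, `b2` loses the extra copy it no longer needs, and `b3` is left alone, so storage overhead grows only for blocks that are currently at risk.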

Who's On Board?: Probabilistic Membership for Real-Time Distributed Control Systems

2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS) Pub Date : 2016-09-01 DOI: 10.1109/SRDS.2016.029

R. Guerraoui, David Kozhaya, M. Oriol, Y. Pignolet

Citations: 5

Experiments with Self-Stabilizing Distributed Data Fusion

2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS) Pub Date : 2016-09-01 DOI: 10.1109/SRDS.2016.046

B. Ducourthial, V. Berge-Cherfaoui

Citations: 7

Model-Checking Assisted Protocol Design for Ultra-reliable Low-Latency Wireless Networks

2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS) Pub Date : 2016-09-01 DOI: 10.1109/SRDS.2016.048

Christian Dombrowski, Sebastian Junges, J. Katoen, J. Gross

Abstract: Recently, the wireless networking community has become increasingly interested in novel protocol designs for safety-critical applications. These new applications come with unprecedented latency and reliability constraints, which pose many open challenges. A particularly important one relates to the question of how to develop such systems. Traditionally, development of wireless systems has mainly relied on simulations to identify viable architectures. However, in this case the drawbacks of simulations, in particular increasing run-times, rule out their application. Instead, in this paper we propose to use probabilistic model checking, a formal model-based verification technique, to evaluate different system variants during the design phase. Apart from allowing evaluations and therefore design iterations with much smaller periods, probabilistic model checking provides bounds on the reliability of the considered design choices. We demonstrate these salient features with respect to the novel EchoRing protocol, which is a token-based system designed for safety-critical industrial applications. Several mechanisms for dealing with a token loss are modeled and evaluated through probabilistic model checking, showing its potential as a suitable evaluation tool for such novel wireless protocols. In particular, we show by probabilistic model checking that wireless token-passing systems can benefit tremendously from the considered fault-tolerant methods. The obtained performance guarantees for the different mechanisms even provide reasonable bounds for experimental results obtained from a real-world implementation.

Citations: 13

UDS: A Novel and Flexible Scheduling Algorithm for Deterministic Multithreading

2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS) Pub Date : 2016-09-01 DOI: 10.1109/SRDS.2016.030

F. Hauck, Gerhard Habiger, Jörg Domaschka

Citations: 3

Network Aware Reliability Analysis for Distributed Storage Systems

2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS) Pub Date : 2016-09-01 DOI: 10.1109/SRDS.2016.042

Amir Epstein, E. K. Kolodner, D. Sotnikov

Citations: 6

TANGO: Toward a More Reliable Mobile Streaming through Cooperation between Cellular Network and Mobile Devices

2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS) Pub Date : 2016-09-01 DOI: 10.1109/SRDS.2016.047

Nawanol Theera-Ampornpunt, Tarun Mangla, S. Bagchi, R. Panta, Kaustubh R. Joshi, M. Ammar, E. Zegura

Abstract: Multimedia streaming is a major mobile application, accounting for more than half of total mobile traffic. Streaming applications usually have a static buffering strategy. For example, buffer size is limited to x minutes of the stream, where x is optimized to provide the best trade-off between minimizing stalls and limiting waste of the user's bandwidth and energy resulting from user abandonment. We show that such strategies based on information available on the mobile device alone do not work well when network conditions change dynamically, e.g., connectivity degrades due to congestion. We propose an alternative strategy using the framework called TANGO, based on a novel idea of cooperation between the cellular network and mobile devices. By monitoring real-time network conditions and continuously predicting user location, our system is able to predict connectivity degradation in the near term. In such events, a notification is sent to the mobile device so that the streaming application can initiate a mitigation action, such as to pre-cache more content. In simulations based on real user traces, we found that TANGO reduces pause time by 13–72%, significantly outperforming DASH, which is the current state of the art.

Citations: 2