{"title":"Battling Bad Bits with Checksums in the Loris Page Cache","authors":"D. V. Moolenbroek, Raja Appuswamy, A. Tanenbaum","doi":"10.1109/LADC.2013.10","DOIUrl":"https://doi.org/10.1109/LADC.2013.10","url":null,"abstract":"In this paper, we aim to improve the reliability of a central part of the operating system storage stack: the page cache. We consider two reliability threats: memory errors, where bits in DRAM are flipped due to cosmic rays, and software bugs, where programming errors may ultimately result in data corruption and crashes. We argue that by making use of checksums, we can significantly reduce the probability that either threat results in any application-visible effects. In particular, we can use checksums to detect memory corruption as well as validate the integrity of the cache's internal state for recovery after a crash. We show that in many cases, we can avoid the overhead of computing checksums especially for these purposes. We implement our ideas in the Loris storage stack. Our analysis and evaluation show that our approach improves the overall reliability of the cache at relatively little added cost.","PeriodicalId":243515,"journal":{"name":"2013 Sixth Latin-American Symposium on Dependable Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129069463","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Slicing as a Distributed Systems Primitive","authors":"Francisco Maia, M. Matos, R. Oliveira, E. Rivière","doi":"10.1109/LADC.2013.21","DOIUrl":"https://doi.org/10.1109/LADC.2013.21","url":null,"abstract":"Large-scale distributed systems appear as the major infrastructures for supporting planet-scale services. These systems call for appropriate management mechanisms and protocols. Slicing is an example of an autonomous, fully decentralized protocol suitable for large-scale environments. It aims at organizing the system into groups of nodes, called slices, according to an application-specific criterion where the size of each slice is relative to the size of the full system. This allows assigning a certain fraction of nodes to different tasks, according to their capabilities. Although useful, current slicing techniques lack some features of considerable practical importance. This paper proposes a slicing protocol that builds on existing solutions and addresses some of their frailties. We present novel solutions to deal with non-uniform slices and to perform online and dynamic slices schema reconfiguration. Moreover, we describe how to provision a slice-local Peer Sampling Service for upper protocol layers and how to enhance slicing protocols with the capability of slicing over more than one attribute. Slicing is presented as a complete, dependable and integrated distributed systems primitive for large-scale systems.","PeriodicalId":243515,"journal":{"name":"2013 Sixth Latin-American Symposium on Dependable Computing","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134393988","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-hypothesis GPS and Electronic Fence Data Fusion for Safety-Critical Positioning in Railway Worksites","authors":"J. Figueiras, Jesper Grønbæk, H. Schwefel, A. Bondavalli","doi":"10.1109/LADC.2013.23","DOIUrl":"https://doi.org/10.1109/LADC.2013.23","url":null,"abstract":"Safety-critical applications often use position information as a means of assessing the safety level of people. For this reason, such information is required to be precise in terms of accuracy and timeliness. This paper addresses positioning mechanisms for personalized warning systems for railway workers. Position accuracy for safety assessment purposes is defined as the precise identification of whether the worker is located in a dangerous or safe zone within a certain worksite. This paper extends a previous publication from the same authors to a scenario with multiple workers, while analyzing the combination of wearable GPS receivers and electronic fences strategically placed at the worksite. The proposed data fusion algorithm comprises a Kalman Filter (KF) for filtering GPS observations and a Hidden Markov Model (HMM) for fusion with fence data. A Multiple-Hypothesis Tracking (MHT) mechanism is used to handle multiple workers within the worksite as a means to compensate for the inability of the fence to distinguish the workers. The proposed solution is analyzed under experimental setups. The obtained results outperformed a GPS-only solution and the previously proposed solution by reducing or even removing false alarm and safety-related missed detection events.","PeriodicalId":243515,"journal":{"name":"2013 Sixth Latin-American Symposium on Dependable Computing","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131605887","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Static Analysis of Dynamic Fault Trees with Priority-AND Gates","authors":"Jianwen Xiang, F. Machida, Kumiko Tadano, Kazuo Yanoo, Wei Sun, Y. Maeno","doi":"10.1109/LADC.2013.14","DOIUrl":"https://doi.org/10.1109/LADC.2013.14","url":null,"abstract":"A PAND gate is a special AND gate of Dynamic Fault Trees (DFTs) where the input events must occur in a specific order for the occurrence of its output event. We present a transformation from a PAND gate to an AND gate with some dependent conditioning events, called CAND gate, provided that the dynamic behavior of the system can be modeled by a (semi-)Markov process. With the transformation, a DFT with only static Boolean logic gates and PAND gates can be transformed into a static fault tree, which opens up the way to employ efficient combinatorial analysis for the DFT. In addition, the PAND gate cannot model the priority relations between the events whose occurrences are not necessary for the output event. The inability has not been addressed before and it can be overcome by the proposed CAND gate.","PeriodicalId":243515,"journal":{"name":"2013 Sixth Latin-American Symposium on Dependable Computing","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134478903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accelerating Online Model Checking","authors":"Mona Qanadilo, Sufyan Samara, Yuhong Zhao","doi":"10.1109/LADC.2013.20","DOIUrl":"https://doi.org/10.1109/LADC.2013.20","url":null,"abstract":"Online model checking is a lightweight verification technique to ensure at runtime the safety of the current execution trace of the system application under test. Doing model checking online suffers from the limited execution time allocated to each checking cycle. In this paper, we focus on accelerating online model checking so that as large a portion of the model space as possible can be explored in time. For this purpose, we introduce offline backward exploration so as to reduce the workload of online forward exploration. As a result, online model checking becomes online reachability checking. A SAT solver is used as the verification engine for online model checking. We improve the performance of the SAT solver zChaff by optimizing and customizing zChaff with respect to the online model checking specific features. Moreover, we take advantage of the parallel feature and the multi-port memory available on FPGA chips. We present a new underlying architecture using two SAT solvers as the verification engine for online model checking. We implement a quick prototype of the new underlying architecture for online model checking. Several experiments are done to test the performance of our online model checking.","PeriodicalId":243515,"journal":{"name":"2013 Sixth Latin-American Symposium on Dependable Computing","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129738390","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Diagnosis of Content Pollution in P2P Live Streaming Networks","authors":"R. P. Ziwich, Emanuel A. Schimidt, E. P. Duarte, Ingrid Jansch-Pôrto","doi":"10.1109/LADC.2013.13","DOIUrl":"https://doi.org/10.1109/LADC.2013.13","url":null,"abstract":"Content pollution is one of the challenges for massively deploying live streaming P2P networks in the Internet. As the peers themselves are responsible for retransmitting data, there is no trivial solution to this problem. This work presents a new strategy to detect content pollution that employs comparison-based diagnosis to identify modifications on the data stream. A peer compares randomly selected chunks received from its neighbors. Based on the comparison results, peers that transmitted polluted content are identified. The proposed solution was implemented using Fireflies, a scalable and intrusion-tolerant overlay network. Experimental results show that the strategy represents a feasible solution to detect content pollution and causes a low overhead in terms of network bandwidth.","PeriodicalId":243515,"journal":{"name":"2013 Sixth Latin-American Symposium on Dependable Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128269953","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Model-Driven Approach for Runtime Reliability Analysis","authors":"D. Sojer, F. Reichenbach, Stein Erik Ellevseth, C. Buckl, A. Knoll","doi":"10.1109/LADC.2013.12","DOIUrl":"https://doi.org/10.1109/LADC.2013.12","url":null,"abstract":"Runtime reliability analysis has proven to be a valuable technique to enhance the overall reliability of safety-critical systems. It has the potential to close the dependability gap that has been identified by Laprie. However, existing approaches suffer from either too complex and therefore error-prone input languages or from long execution time due to the state space explosion of the underlying analysis techniques. In this paper, we present an approach for runtime reliability analysis, which handles both problems. It provides a compact metamodel that can be used to describe all necessary information. Moreover, it provides analysis algorithms that can be automatically parameterized by code generation. These algorithms are runtime efficient so that they can be executed even on low-end computers, e.g., safety-critical embedded systems, to adapt the system to changing environmental conditions.","PeriodicalId":243515,"journal":{"name":"2013 Sixth Latin-American Symposium on Dependable Computing","volume":"31 13","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113967922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Time Dimension in Predicting Failures: A Case Study","authors":"Ivano Irrera, C. Pereira, M. Vieira","doi":"10.1109/LADC.2013.25","DOIUrl":"https://doi.org/10.1109/LADC.2013.25","url":null,"abstract":"Online Failure Prediction is a cutting-edge technique for improving the dependability of software systems. It makes extensive use of machine learning techniques applied to variables monitored from the system at regular intervals of time (e.g. mutexes/s, paged bytes/s, etc.). The goal of this work is to assess the impact of considering the time dimension in failure prediction, through the use of sliding windows. The state-of-the-art SVM (Support Vector Machine) classifier is used to support the study, predicting failure events occurring in a Windows XP machine. An extensive comparative analysis is carried out, in particular using a software fault injection technique to speed up the failure data generation process.","PeriodicalId":243515,"journal":{"name":"2013 Sixth Latin-American Symposium on Dependable Computing","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121357667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robustness Evaluation of Controllers in Self-Adaptive Software Systems","authors":"J. Cámara, R. Lemos, N. Laranjeiro, Rafael Ventura, M. Vieira","doi":"10.1109/LADC.2013.17","DOIUrl":"https://doi.org/10.1109/LADC.2013.17","url":null,"abstract":"An increasingly important requirement for software-intensive systems is the ability to self-manage by adapting their structure and behavior at run-time in an autonomous way as a response to a variety of changes that may occur to the system, its environment, or its goals. In particular, self-adaptive (or autonomic) systems incorporate complex software components that act as controllers of a target system by executing actions through effectors, based on information monitored by probes. However, although these controllers are becoming critical in many application domains, it is still difficult to assess their robustness. The proposed approach for evaluating the robustness of controllers for self-adaptive software systems is aimed at the effective identification of design faults. To achieve this objective, our proposal is based on a set of robustness tests that include the provision of mutated inputs to the interfaces between the controller and the target system (i.e., probes). The feasibility of the approach is evaluated in the context of Znn.com, a case study implemented using the Rainbow framework for architecture-based self-adaptation.","PeriodicalId":243515,"journal":{"name":"2013 Sixth Latin-American Symposium on Dependable Computing","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130345808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reliability Analysis of Software Architecture Evolution","authors":"J. M. Franco, R. Barbosa, M. Z. Rela","doi":"10.1109/LADC.2013.16","DOIUrl":"https://doi.org/10.1109/LADC.2013.16","url":null,"abstract":"Software engineers and practitioners regard software architecture as an important artifact, providing the means to model the structure and behavior of systems and to support early decisions on dependability and other quality attributes. Since systems are most often subject to evolution, the software architecture can be used as an early indicator on the impact of the planned evolution on quality attributes. We propose an automated approach to evaluate the impact on reliability of architecture evolution. Our approach provides relevant information for architects to predict the impact of component reliabilities, usage profile and system structure on the overall reliability. We translate a system's architectural description written in an Architecture Description Language (ADL) to a stochastic model suitable for performing a thorough analysis on the possible architectural modifications. We applied our method to a case study widely used in research in which we identified the reliability bottlenecks and performed structural modifications to obtain an improved architecture regarding its reliability.","PeriodicalId":243515,"journal":{"name":"2013 Sixth Latin-American Symposium on Dependable Computing","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133116116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}