{"title":"Efficient Task Allocation to FPGAs in the Safety Critical Domain","authors":"P. Conmy, I. Bate","doi":"10.1109/PRDC.2011.23","DOIUrl":"https://doi.org/10.1109/PRDC.2011.23","url":null,"abstract":"Field Programmable Gate Arrays (FPGAs) are highly configurable programmable logic devices. They offer many benefits over traditional micro-processors such as the ability to efficiently run tasks in parallel and also highly predictable timing performance. They are becoming increasingly popular for use in the safety critical domain where predictability is essential. However, concerns about their dependability, principally their reliability and difficulties in assessing the impact of an internal failure, mean that current designs are inefficient and conservative. This paper discusses these issues in depth. It also presents an FPGA task allocation method using simulated annealing to balance efficiency and reliability requirements. This can be used to improve designs of safety critical FPGA based systems.","PeriodicalId":254760,"journal":{"name":"2011 IEEE 17th Pacific Rim International Symposium on Dependable Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125913409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Failure Analysis of a Complex Learning Framework Incorporating Multi-modal and Semi-supervised Learning","authors":"L. Pullum, Christopher T. Symons","doi":"10.1109/PRDC.2011.52","DOIUrl":"https://doi.org/10.1109/PRDC.2011.52","url":null,"abstract":"Machine learning is used in many applications, from machine vision to speech recognition to decision support systems, and it is used to test applications. However, though much has been done to evaluate the performance of machine learning algorithms, little has been done to verify the algorithms or examine their failure modes. Moreover, complex learning frameworks often require stepping beyond black box evaluation to distinguish between errors based on natural limits on learning and errors that arise from mistakes in implementation. We present a conceptual architecture, failure model and taxonomy, and failure modes and effects analysis (FMEA) of a semi-supervised, multi-modal learning system, and provide specific examples from its use in a radiological analysis assistant system. The goal of the research described in this paper is to provide a foundation from which dependability analysis of systems using semi-supervised, multi-modal learning can be conducted. The methods presented provide a first step towards that overall goal.","PeriodicalId":254760,"journal":{"name":"2011 IEEE 17th Pacific Rim International Symposium on Dependable Computing","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115532972","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Resilient Virtual Clusters","authors":"Michael V. Le, I. Hsu, Y. Tamir","doi":"10.1109/PRDC.2011.33","DOIUrl":"https://doi.org/10.1109/PRDC.2011.33","url":null,"abstract":"Clusters of computers can provide, in aggregate, reliable services despite the failure of individual computers. System-level virtualization is widely used to consolidate the workload of multiple physical systems as multiple virtual machines (VMs) on a single physical computer. A single physical computer thus forms a virtual cluster of VMs. A key difficulty with virtualization is that the failure of the virtualization infrastructure (VI) often leads to the failure of multiple VMs. This is likely to overload \"cluster computing\" resiliency mechanisms, typically designed to tolerate the failure of only a single node at a time. By supporting recovery from failure of key VI components, we have enhanced the resiliency of a VI (Xen), thus enabling the use of existing \"cluster computing\" techniques to provide resilient virtual clusters. In the overwhelming majority of cases, these enhancements allow recovery from errors in the VI to be accomplished without the failure of more than a single VM. The resulting resiliency of the virtual cluster is demonstrated by running two existing \"cluster computing\" systems while subjecting the VI to injected faults.","PeriodicalId":254760,"journal":{"name":"2011 IEEE 17th Pacific Rim International Symposium on Dependable Computing","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128668791","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Model Checking Multitask Applications for OSEK Compliant Real-Time Operating Systems","authors":"Mark L. McKelvin, E. Gamble, G. Holzmann","doi":"10.1109/PRDC.2011.49","DOIUrl":"https://doi.org/10.1109/PRDC.2011.49","url":null,"abstract":"In the verification of multitask software in real-time embedded systems, general purpose model checkers do not inherently consider characteristics of the real-time operating system, such as priority-based scheduling, priority inversion, and protocols for protecting shared memory resources. Since explicit state model checkers generally explore all possible execution paths and task interleaving, this could potentially lead to exploring execution paths that are redundant, unnecessarily increasing verification complexity and hampering tractability. Based on this premise, in this work we investigate how one can improve the performance of explicit state model checkers, such as SPIN, for the verification of multitask applications that target real-time operating systems.","PeriodicalId":254760,"journal":{"name":"2011 IEEE 17th Pacific Rim International Symposium on Dependable Computing","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128705576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tool Supported Model-Based Safety Analysis and Optimization","authors":"Matthias Güdemann, Michael Lipaczewski, F. Ortmeier","doi":"10.1109/PRDC.2011.44","DOIUrl":"https://doi.org/10.1109/PRDC.2011.44","url":null,"abstract":"Although model-based approaches can yield very precise safety analysis, they are rarely used in practice. The reason is that most techniques are very difficult to apply and almost always require separate models and tools. In this paper we present an outline for the integration of different model-based safety analysis and safety optimization methods into a single tool framework. We present the envisioned work-flow and some of the requirements for the tool integration. Because of its wide acceptance, platform independence and its well-documented API, we chose the Eclipse platform as framework foundation.","PeriodicalId":254760,"journal":{"name":"2011 IEEE 17th Pacific Rim International Symposium on Dependable Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129090533","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Self-Checking Components for Dependable Interactive Cockpits Using Formal Description Techniques","authors":"A. Tankeu-Choitat, D. Navarre, Philippe A. Palanque, Y. Déléris, J. Fabre, Camille Fayollas","doi":"10.1109/PRDC.2011.28","DOIUrl":"https://doi.org/10.1109/PRDC.2011.28","url":null,"abstract":"In the last few years, glass cockpits are being replaced by interactive cockpits to provide a higher level of integration of both command and information display. Due to their event driven nature, interactive systems offer more display and control capabilities but they require specific error detection and fault tolerance techniques to reach a high level of dependability. This paper proposes a model-based approach for adding fault tolerance mechanisms to interactive cockpits. While several mechanisms are considered and presented, the contribution is focused on the formal description of self-checking widgets, being the basis for interactive cockpits.","PeriodicalId":254760,"journal":{"name":"2011 IEEE 17th Pacific Rim International Symposium on Dependable Computing","volume":"95 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124629000","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recovery from Failures Due to Mandelbugs in IT Systems","authors":"Kishor S. Trivedi, R. Mansharamani, Dong Seong Kim, Michael Grottke, M. Nambiar","doi":"10.1109/PRDC.2011.34","DOIUrl":"https://doi.org/10.1109/PRDC.2011.34","url":null,"abstract":"Several studies have been carried out on software bugs analysis and classification for life and mission critical systems, which include reproducible bugs called Bohrbugs, and hard to reproduce bugs called Mandelbugs. Although software reliability in IT systems has been studied for years, there are only a few formal analytic models for recovery from Mandelbugs. This paper discusses in detail several real cases of Mandelbugs and presents a simple flowchart which describes the recovery processes implemented in IT systems for a large variety of Mandelbugs. The flowchart is based on more than 10 IT systems that are running in production. The paper then presents a closed-form expression of the mean time to recovery from these bugs. Measures of interest including mean time to recovery and system unavailability are computed. A numerical and parametric sensitivity analysis of the model parameters is carried out. This analysis allows the designer to find out important parameter(s) for the recovery from failures due to Mandelbugs.","PeriodicalId":254760,"journal":{"name":"2011 IEEE 17th Pacific Rim International Symposium on Dependable Computing","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114756612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Native Binary Mutation Analysis for Embedded Software and Virtual Prototypes in SystemC","authors":"C. Kuznik, W. Müller","doi":"10.1109/PRDC.2011.47","DOIUrl":"https://doi.org/10.1109/PRDC.2011.47","url":null,"abstract":"Mutation analysis is a powerful tool for white-box testing of the verification environment in order to produce dependable and higher quality software products. However, due to high computational costs and the focus on high-level software languages such as Java, mutation analysis is not yet widely used in commercial design flows targeting embedded (software) systems. Here the industry is modeling both hardware and related software parts at higher levels of abstraction, called virtual prototypes, to accelerate parallel development and shorten time-to-market. In this paper we propose a mutation testing verification flow for SystemC based virtual prototypes that may not rely on source code only but on annotated basic blocks and enables mutant creation at assembler level to heavily reduce execution costs and equivalence mutants likelihood.","PeriodicalId":254760,"journal":{"name":"2011 IEEE 17th Pacific Rim International Symposium on Dependable Computing","volume":"108 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127941114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Dependability Benchmarks to Support ISO/IEC SQuaRE","authors":"Jesus Friginal, D. Andrés, Juan-Carlos Ruiz-Garcia, Regina L. O. Moraes","doi":"10.1109/PRDC.2011.13","DOIUrl":"https://doi.org/10.1109/PRDC.2011.13","url":null,"abstract":"The integration of Commercial-Off-The-Shelf (COTS) components in software has reduced time-to-market and production costs, but selecting the most suitable component, among those available, still remains a challenging task. This selection process, typically named benchmarking, requires evaluating the behaviour of eligible components in operation, and ranking them according to quality characteristics. Most existing benchmarks only provide measures characterising the behaviour of software systems in the absence of faults, ignoring the hard impact that both accidental and malicious faults have on software quality. However, since using COTS to build a system may motivate the emergence of dependability issues due to the interaction between components, benchmarking the system in the presence of faults is essential. The recent ISO/IEC 25045 standard copes with this lack by considering accidental faults when assessing the recoverability capabilities of software systems. This paper proposes a dependability benchmarking approach to determine the impact that faults (noted as disturbances in the standard) either accidental or malicious may have on the quality features exhibited by software components. As will be shown, the usefulness of the approach embraces all evaluator profiles (developers, acquirers and third-party evaluators) identified in the ISO/IEC 25000 \"SQuaRE\" standard. The feasibility of the proposal is finally illustrated through the benchmarking of three distinct software components, which implement the OLSR protocol specification, competing for integration in a wireless mesh network.","PeriodicalId":254760,"journal":{"name":"2011 IEEE 17th Pacific Rim International Symposium on Dependable Computing","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117052300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dependability Improvement for Critical Systems","authors":"H. Hecht","doi":"10.1109/PRDC.2011.25","DOIUrl":"https://doi.org/10.1109/PRDC.2011.25","url":null,"abstract":"Control systems for airliners, military aircraft, automobiles, and for the safety of nuclear power plants are typical of the critical digital systems addressed in this paper. These systems are considered safe by the public: their accident rate is sufficiently low that it does not prevent their widespread acceptance. Nevertheless, developers, regulators and users would like to see further improvements in dependability. Accidents of scheduled air carriers are very rare, but when they do occur they are exhaustively investigated. The public record of these investigations is therefore a good starting point for exploring dependability improvement in critical systems. Examples presented in this paper show how current development practices permitted hazardous situations to exist and a methodology for reducing the frequency of such hazards is presented.","PeriodicalId":254760,"journal":{"name":"2011 IEEE 17th Pacific Rim International Symposium on Dependable Computing","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130817159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}