Managing System Failure Risk: Performance Control and Mission Abort Decisions
Author: Qingan Qiu
Published in: 2023 9th International Symposium on System Security, Safety, and Reliability (ISSSR), 2023-06-01
DOI: 10.1109/ISSSR58837.2023.00018 (https://doi.org/10.1109/ISSSR58837.2023.00018)
Citations: 0
Abstract
Failures in safety-critical systems can have severe consequences, including loss of life and significant economic loss, so effective risk control policies are essential for system survivability. Traditional approaches focus on preventive maintenance, which can be time-consuming and impractical during continuous mission execution. This research proposes an alternative: because a system's degradation behavior depends on its performance level, deterioration can be controlled by adjusting performance dynamically. Mission abort is also considered as an intuitive way to mitigate safety hazards. To achieve flexible risk control during mission execution, the study adjusts the performance level and makes mission abort decisions dynamically, based on the deterioration level and the amount of remaining work. The problem is formulated as a Markov decision process, and optimal policies are derived by analyzing its structural properties. The optimal performance control and mission abort policies are shown to exhibit a threshold structure that depends on the performance level and the degradation process, and heuristic policies are evaluated for comparison. Using condition information for dynamic adjustment has the potential to reduce failure risk and operational cost in safety-critical systems.
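To make the Markov decision process formulation concrete, the following is a minimal sketch of a toy model, not the paper's actual model. Everything in it is an assumption chosen for illustration: the discrete degradation levels, the two performance levels ("low" and "high"), the per-period degradation probabilities, the work rates, and the three cost parameters. The state is the pair (degradation level, remaining work), and each period the decision maker either aborts the mission or picks a performance level.

```python
def solve_toy_mdp(D=5, W=10, degrade_prob=(0.1, 0.3), work_rate=(1, 2),
                  names=("low", "high"), c_op=1.0, c_abort=30.0, c_fail=100.0):
    """Backward induction for a hypothetical abort/performance-control MDP.

    State (d, w): degradation level d in 0..D (D means failed) and
    remaining work w in 0..W. All parameters are illustrative assumptions.
    Returns (V, policy), where V[d][w] is the minimal expected cost.
    """
    V = [[0.0] * (W + 1) for _ in range(D + 1)]
    policy = [["done"] * (W + 1) for _ in range(D + 1)]
    for w in range(W + 1):
        V[D][w] = c_fail            # system failure is absorbing and costly
        policy[D][w] = "failed"
    # Remaining work strictly decreases, so one sweep over w is enough.
    for w in range(1, W + 1):
        for d in range(D):
            # Aborting is always available at a fixed mission-loss cost.
            best_cost, best_action = c_abort, "abort"
            for name, p, k in zip(names, degrade_prob, work_rate):
                w_next = max(w - k, 0)
                # With probability p the system degrades by one level
                # during the period; otherwise it stays at level d.
                cost = c_op + p * V[d + 1][w_next] + (1 - p) * V[d][w_next]
                if cost < best_cost:
                    best_cost, best_action = cost, name
            V[d][w] = best_cost
            policy[d][w] = best_action
    return V, policy


V, policy = solve_toy_mdp()
for d in range(5):
    # One row per degradation level, columns are remaining work 1..10.
    print(d, policy[d][1:])
```

In this toy model the expected cost is non-decreasing in the degradation level, so for a fixed amount of remaining work, once aborting is chosen at one degradation level it is also chosen at every higher level. That mirrors the kind of threshold structure the abstract reports, but it is a property of this simplified sketch only and does not reproduce the paper's results.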