Title: An optimal joint maintenance and mission abort policy for a system executing multi-attempt missions
Authors: Sangqi Zhao, Yian Wei, Yang Li, Yao Cheng
Journal: Reliability Engineering & System Safety, volume 266, Article 111667
Publication date: 2025-09-05
Publication type: Journal Article
DOI: 10.1016/j.ress.2025.111667
URL: https://www.sciencedirect.com/science/article/pii/S0951832025008671
Journal impact factor: 11.0 (JCR Q1, Engineering, Industrial; category: Engineering and Technology, region 1)
Open access: no
Citation count: 0
Abstract
Mission-critical systems are subject to deterioration-induced failures that incur not only a mission failure cost but also a system failure penalty. Deciding whether and when to abort the mission is crucial for minimizing the overall cost. When a mission’s success is evaluated in terms of cumulative execution time and can be achieved over multiple attempts, operators can perform maintenance to increase the mission success probability. This requires deciding the system maintenance timing together with the mission abort decisions, which is challenging because of both the complex multi-layer interactions between these two decision variables and the large state and action spaces. In this paper, we develop a Markov decision process (MDP) framework to determine the optimal system maintenance and mission abort timing. First, we propose a joint maintenance and mission abort policy that lets the operator account for the maintenance cost in decision-making and implement system maintenance and mission abort throughout the mission execution process; it thereby outperforms existing alternatives in overall cost minimization. Second, we develop an MDP-based optimization framework and analytically obtain the structural properties of the optimal policy, including the existence of state-dependent control limits for the system maintenance and mission abort decisions and their interdependence. Third, we develop an enhanced value iteration algorithm that exploits these structural properties to significantly improve computational efficiency over the standard approach. The advantages of the proposed policy and algorithm are demonstrated by a case study of a UAV performing a surveillance mission.
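As a rough illustration of the kind of formulation the abstract describes, the following Python sketch runs plain (not enhanced) value iteration on a toy joint maintenance and abort MDP. Everything in it is an assumption made for illustration and is not taken from the paper: the state `(d, r)` (degradation level and remaining required execution time), the one-step degradation probability `p`, the three actions, and all cost values are hypothetical.

```python
from itertools import product


def value_iteration(D=4, R=6, p=0.3, c_maint=2.0, c_mission=10.0,
                    c_fail=30.0, tol=1e-9, max_iter=10_000):
    """Plain value iteration on a toy maintenance/abort MDP (illustrative only).

    State (d, r): d in 0..D-1 is the degradation level (level D means failure),
    r in 1..R is the remaining execution time needed for mission success.
    All parameters are hypothetical, not the paper's model.
    """
    states = list(product(range(D), range(1, R + 1)))
    V = {s: 0.0 for s in states}

    def cost_to_go(d, r, V):
        if d >= D:          # system failed: failure penalty plus mission failure cost
            return c_fail + c_mission
        if r == 0:          # required execution time accumulated: mission success
            return 0.0
        return V[(d, r)]

    def q_values(d, r, V):
        q = {
            # Abort: give up the mission, system survives.
            "abort": c_mission,
            # Continue: one time unit of work; degrade one level with probability p.
            "continue": p * cost_to_go(d + 1, r - 1, V)
                        + (1 - p) * cost_to_go(d, r - 1, V),
        }
        if d > 0:
            # Maintain: pay c_maint, restore to level 0, start a new attempt.
            q["maintain"] = c_maint + V[(0, r)]
        return q

    for _ in range(max_iter):
        V_new = {(d, r): min(q_values(d, r, V).values()) for d, r in states}
        delta = max(abs(V_new[s] - V[s]) for s in states)
        V = V_new
        if delta < tol:
            break

    policy = {}
    for d, r in states:
        q = q_values(d, r, V)
        policy[(d, r)] = min(q, key=q.get)
    return V, policy


if __name__ == "__main__":
    V, policy = value_iteration()
    # Print the chosen action per state: rows are degradation levels,
    # columns are remaining required time units.
    for d in range(4):
        print(d, [policy[(d, r)][0] for r in range(1, 7)])
```

In this toy model the expected cost is non-decreasing in the degradation level for a fixed remaining time, which is the sort of monotonicity that makes control-limit policies plausible. The structural results and the enhanced algorithm in the paper itself concern its own model and are not reproduced here.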
About the journal:
Elsevier publishes Reliability Engineering & System Safety in association with the European Safety and Reliability Association and the Safety Engineering and Risk Analysis Division. The international journal is devoted to developing and applying methods to enhance the safety and reliability of complex technological systems, like nuclear power plants, chemical plants, hazardous waste facilities, space systems, offshore and maritime systems, transportation systems, constructed infrastructure, and manufacturing plants. The journal normally publishes only articles that involve the analysis of substantive problems related to the reliability of complex systems or present techniques and/or theoretical results that have a discernable relationship to the solution of such problems. An important aim is to balance academic material and practical applications.