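Deep reinforcement learning for joint optimization of maintenance and spare parts ordering considering spare parts supply uncertainty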

IF 11 | Zone 1: Engineering and Technology | Q1: ENGINEERING, INDUSTRIAL

Reliability Engineering & System Safety, Volume 264, Article 111385. Pub Date: 2025-06-22. DOI: 10.1016/j.ress.2025.111385

Yunxin Zhu, Meimei Zheng, Zhiyun Su, Tangbin Xia, Jie Lin, Ershun Pan


Citations: 0

Abstract

Efficient maintenance and spare parts ordering strategies can reduce costs for manufacturing companies. In recent years, important components may face supply risks due to geopolitical conflicts, trade conflicts, and limited key resources. This paper investigates the joint optimization of condition-based maintenance and dual sourcing of spare parts from reliable and unreliable suppliers. We formulate this joint decision problem as a Markov decision process and design a value iteration algorithm to obtain exact solutions for the optimal maintenance and ordering policy. However, the value iteration algorithm is not suitable for large-scale problems because of its long running time. We therefore develop a deep Q-network (DQN) algorithm based on deep reinforcement learning to improve computational efficiency. Numerical experiments validate the effectiveness of the DQN algorithm. The results show that, for systems with more than 4 components and more than 5 states, the DQN algorithm reduces the running time by 92.58% while staying within a 4.82% cost gap of the value iteration algorithm. Compared to the separate heuristic policy, the DQN algorithm reduces the cost by 11.27% on average.
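To make the value iteration step in the abstract concrete, here is a minimal sketch on a toy problem. It is not the paper's model: it assumes a single component with four deterioration levels, a small spare parts stock, one-period deliveries, and a payment on ordering even when the unreliable supplier fails to deliver. Every parameter value and every name (`step`, `feasible_actions`, `value_iteration`, the `COST` table) is hypothetical and chosen only for illustration.

```python
import itertools

# All numbers below are illustrative assumptions, not values from the paper.
D = 4                  # deterioration levels 0..3; level 3 means failed
S_MAX = 2              # maximum number of spares held in stock
Q_DETERIORATE = 0.3    # per-period chance of moving one level worse
P_UNRELIABLE = 0.6     # chance that the unreliable supplier delivers
GAMMA = 0.95           # discount factor
COST = {"preventive": 20.0, "corrective": 50.0, "downtime": 100.0,
        "holding": 1.0, "reliable": 12.0, "unreliable": 8.0}

STATES = list(itertools.product(range(D), range(S_MAX + 1)))
ORDERS = ("none", "reliable", "unreliable")


def feasible_actions(state):
    """Yield (replace, order) pairs allowed in this state."""
    _, s = state
    for replace in (False, True):
        if replace and s == 0:
            continue  # cannot replace without a spare on hand
        stock_after = s - int(replace)
        for order in ORDERS:
            if order != "none" and stock_after >= S_MAX:
                continue  # no room for another spare
            yield (replace, order)


def step(state, action):
    """Return (immediate_cost, [(probability, next_state), ...])."""
    d, s = state
    replace, order = action
    cost = COST["holding"] * s
    if replace:
        cost += COST["corrective"] if d == D - 1 else COST["preventive"]
        d, s = 0, s - 1
    elif d == D - 1:
        cost += COST["downtime"]  # failed and left unrepaired this period
    # Supply outcome as (probability, units delivered next period).
    if order == "reliable":
        cost += COST["reliable"]
        deliveries = [(1.0, 1)]
    elif order == "unreliable":
        cost += COST["unreliable"]
        deliveries = [(P_UNRELIABLE, 1), (1.0 - P_UNRELIABLE, 0)]
    else:
        deliveries = [(1.0, 0)]
    # Deterioration outcome; a failed component stays failed.
    if d == D - 1:
        wear = [(1.0, d)]
    else:
        wear = [(Q_DETERIORATE, d + 1), (1.0 - Q_DETERIORATE, d)]
    outcomes = [(pd * pw, (d_next, s + units))
                for pd, units in deliveries for pw, d_next in wear]
    return cost, outcomes


def q_value(state, action, values):
    """Expected discounted cost of taking `action` in `state`."""
    cost, outcomes = step(state, action)
    return cost + GAMMA * sum(p * values[nxt] for p, nxt in outcomes)


def value_iteration(tol=1e-8, max_iter=10_000):
    """Apply the Bellman update until the largest change is below `tol`."""
    values = {st: 0.0 for st in STATES}
    for _ in range(max_iter):
        new_values = {st: min(q_value(st, a, values)
                              for a in feasible_actions(st))
                      for st in STATES}
        gap = max(abs(new_values[st] - values[st]) for st in STATES)
        values = new_values
        if gap < tol:
            break
    policy = {st: min(feasible_actions(st),
                      key=lambda a: q_value(st, a, values))
              for st in STATES}
    return values, policy


if __name__ == "__main__":
    values, policy = value_iteration()
    for st in STATES:
        print(st, round(values[st], 2), policy[st])
```

The table `values` has one entry per state, so its size grows multiplicatively with the number of components and deterioration levels. That growth is the scaling problem the abstract gives as the reason for replacing the exact table with a DQN approximation; the sketch above does not implement the DQN.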


Source journal

Reliability Engineering & System Safety (Management Science; Engineering: Industrial)

CiteScore: 15.20

Self-citation rate: 39.50%

Number of articles published: 621

Review time: 67 days

Journal description: Elsevier publishes Reliability Engineering & System Safety in association with the European Safety and Reliability Association and the Safety Engineering and Risk Analysis Division. The international journal is devoted to developing and applying methods to enhance the safety and reliability of complex technological systems, such as nuclear power plants, chemical plants, hazardous waste facilities, space systems, offshore and maritime systems, transportation systems, constructed infrastructure, and manufacturing plants. The journal normally publishes only articles that analyze substantive problems related to the reliability of complex systems, or that present techniques or theoretical results with a discernible relationship to the solution of such problems. An important aim is to balance academic material and practical applications.