Deep reinforcement learning for dynamic flexible job shop scheduling problem considering variable processing times
Lu Zhang, Yi Feng, Qinge Xiao, Yunlang Xu, Di Li, Dongsheng Yang, Zhile Yang
Journal of Manufacturing Systems, Volume 71, Pages 257-273 (JCR Q1, Engineering, Industrial; IF 12.2)
DOI: 10.1016/j.jmsy.2023.09.009
Published: 2023-09-27
URL: https://www.sciencedirect.com/science/article/pii/S0278612523001917
Citations: 0
Abstract
In recent years, uncertainty and complexity in the production process, driven by growing customization requirements, have dramatically increased the difficulty of Dynamic Flexible Job Shop Scheduling (DFJSP). This paper investigates a new DFJSP model that minimizes completion time under machine processing time uncertainty, i.e., the VPT-FJSP problem. In the formulated VPT-FJSP process, each workpiece must be processed on a required machine in a certain time slot; a Markov decision process (MDP) formulation and reinforcement learning methods are adopted to solve VPT-FJSP. The agent designed in this paper employs the Proximal Policy Optimization (PPO) algorithm from deep reinforcement learning, built on an Actor-Critic network. The network takes a processing information matrix as input and embeds high-level workshop states via a graph neural network, enabling the agent to learn the complete state of the environment. Finally, we train and test the proposed framework on canonical FJSP benchmarks, and the experimental results show that our agent outperforms the genetic algorithm and ant colony optimization in most cases (94.29% of static scheduling instances). It also shows superiority over scheduling rules in dynamic environments and demonstrates strong robustness in solving VPT-FJSP. Furthermore, this study tested the generalization capability of the agent on VPT-FJSP instances of different scales. In minimizing makespan, the agent outperformed four priority dispatching rules. These results indicate that the proposed dynamic scheduling framework and PPO algorithm are more effective in achieving superior solutions.
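To make the VPT-FJSP setting concrete, the following is a minimal sketch (not the paper's implementation) of the scheduling environment the abstract describes: each operation has several candidate machines with nominal processing times, and the realized time deviates randomly to model variable processing times. A simple earliest-finish dispatching rule stands in for the learned PPO policy; the job data, function names, and the `vpt` deviation parameter are illustrative assumptions.

```python
import random

def simulate(jobs, n_machines, vpt=0.2, seed=0):
    """Greedy earliest-finish dispatching on a toy VPT-FJSP instance.

    jobs: per job, a sequence of operations; each operation is a list of
    (machine, nominal_time) alternatives. `vpt` is the relative half-width
    of the uniform processing-time perturbation (an illustrative model of
    the paper's variable processing times). Returns the makespan.
    """
    rng = random.Random(seed)
    machine_free = [0.0] * n_machines   # when each machine becomes idle
    job_ready = [0.0] * len(jobs)       # when each job's next op may start
    next_op = [0] * len(jobs)
    total_ops = sum(len(j) for j in jobs)
    for _ in range(total_ops):
        # Action: pick the (job, machine) pair with the earliest nominal
        # finish time -- a dispatching rule standing in for the PPO agent.
        best = None
        for j, ops in enumerate(jobs):
            if next_op[j] >= len(ops):
                continue
            for m, t in ops[next_op[j]]:
                start = max(machine_free[m], job_ready[j])
                if best is None or start + t < best[0]:
                    best = (start + t, j, m, start, t)
        _, j, m, start, t = best
        # Realized duration deviates from the nominal one (VPT uncertainty).
        real = t * (1 + rng.uniform(-vpt, vpt))
        machine_free[m] = job_ready[j] = start + real
        next_op[j] += 1
    return max(machine_free)

# Two jobs, two operations each; each operation lists machine alternatives.
jobs = [
    [[(0, 3), (1, 5)], [(1, 2)]],
    [[(1, 4), (0, 6)], [(0, 3)]],
]
print(simulate(jobs, n_machines=2, vpt=0.0))  # nominal times only -> 7.0
```

In an RL formulation such as the paper's MDP, the greedy `best` selection above would be replaced by the policy network's sampled action, with (negative) makespan increments as the reward signal.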
Journal Introduction:
The Journal of Manufacturing Systems is dedicated to showcasing cutting-edge fundamental and applied research in manufacturing at the systems level. Encompassing products, equipment, people, information, control, and support functions, manufacturing systems play a pivotal role in the economical and competitive development, production, delivery, and total lifecycle of products, meeting market and societal needs.
With a commitment to publishing archival scholarly literature, the journal strives to advance the state of the art in manufacturing systems and foster innovation in crafting efficient, robust, and sustainable manufacturing systems. The focus extends from equipment-level considerations to the broader scope of the extended enterprise. The Journal welcomes research addressing challenges across various scales, including nano, micro, and macro-scale manufacturing, and spanning diverse sectors such as aerospace, automotive, energy, and medical device manufacturing.