A dynamic scheduling method with Conv-Dueling and generalized representation based on reinforcement learning
Authors: Minghao Xia, Haibin Liu, Mingfei Li, Long Wang
Journal: International Journal of Industrial Engineering Computations, 14(1) (JCR Q4, Engineering, Industrial; impact factor 1.6)
Publication date: 2023-01-01
DOI: https://doi.org/10.5267/j.ijiec.2023.6.003
Citation count: 1
Abstract
In modern industrial manufacturing, uncertain dynamic disturbances affecting processing machines and jobs can disrupt the original production plan. This research addresses the dynamic multi-objective flexible scheduling problem, which involves multi-constraint relationships among machines, jobs, and uncertain disturbance events. The disturbance events considered are job insertion, machine breakdown, and processing time change. The paper proposes a conv-dueling network model, a multidimensional state representation of job processing information, and three scheduling objectives: minimizing makespan, minimizing delay time, and maximizing the completion punctuality rate. We design a multidimensional state space that includes job and machine processing information, an efficient and complete scheduling action space for the agent, and a compound scheduling reward function that combines the main task and the branch task. The network model is trained without supervision using the dueling double deep Q-network (D3QN) algorithm. Given the production environment information, with its multiple constraints and disturbances, the multidimensional state representation matrix of the jobs is fed to the conv-dueling network, which extracts features, makes the decision, and outputs the optimal scheduling rules. Simulation experiments are carried out on 50 test cases. The results show that the proposed conv-dueling network model converges quickly under the DQN, DDQN, and D3QN algorithms and has good stability and universality. The proposed scheduling algorithm outperforms DQN, DDQN, and single scheduling algorithms on all three scheduling objectives, and it demonstrates high robustness and excellent comprehensive scheduling performance.
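The abstract names D3QN without spelling out its two ingredients, so the following is a minimal sketch of them in plain Python rather than the paper's implementation. It assumes the standard dueling aggregation Q(s, a) = V(s) + A(s, a) - mean(A) and the standard double-DQN target, where the online network selects the next action and the target network evaluates it. The function names, the three-action example, and all numeric values are hypothetical; in the paper's setting each action would correspond to a scheduling rule, and V and A would come from the conv-dueling network.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Combine V(s) and A(s, a) into Q(s, a) = V(s) + A(s, a) - mean(A)."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def double_dqn_target(
    reward: float,
    gamma: float,
    done: bool,
    next_q_online: List[float],
    next_q_target: List[float],
) -> float:
    """Double-DQN target: online net picks the action, target net scores it."""
    if done:
        # No bootstrapping from a terminal state.
        return reward
    best_action = max(range(len(next_q_online)), key=lambda i: next_q_online[i])
    return reward + gamma * next_q_target[best_action]


# Hypothetical next-state outputs for three candidate scheduling rules.
q_online = dueling_q_values(1.0, [0.5, -0.5, 0.0])   # [1.5, 0.5, 1.0]
q_target = dueling_q_values(2.0, [-1.0, 0.0, 1.0])   # [1.0, 2.0, 3.0]

# The online net prefers action 0, so the target net's value for action 0 is used.
target = double_dqn_target(1.0, 0.5, False, q_online, q_target)
print(target)  # 1.5
```

Note that the target uses the target network's value for the action chosen by the online network (1.0), not the target network's own maximum (3.0); this decoupling is what distinguishes double DQN from plain DQN.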