S. Lang, Fabian Behrendt, Nico Lanzerath, T. Reggelin, Marcel Müller
{"title":"柔性作业车间生产实时调度的深度强化学习与离散事件仿真集成","authors":"S. Lang, Fabian Behrendt, Nico Lanzerath, T. Reggelin, Marcel Müller","doi":"10.1109/WSC48552.2020.9383997","DOIUrl":null,"url":null,"abstract":"The following paper presents the application of Deep Q-Networks (DQN) for solving a flexible job shop problem with integrated process planning. DQN is a deep reinforcement learning algorithm, which aims to train an agent to perform a specific task. In particular, we train two DQN agents in connection with a discrete-event simulation model of the problem, where one agent is responsible for the selection of operation sequences, while the other allocates jobs to machines. We compare the performance of DQN with the GRASP metaheuristic. After less than one hour of training, DQN generates schedules providing a lower makespan and total tardiness as the GRASP algorithm. Our first investigations reveal that DQN seems to generalize the training data to other problem cases. Once trained, the prediction and evaluation of new production schedules requires less than 0.2 seconds.","PeriodicalId":6692,"journal":{"name":"2020 Winter Simulation Conference (WSC)","volume":"142 1","pages":"3057-3068"},"PeriodicalIF":0.0000,"publicationDate":"2020-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"17","resultStr":"{\"title\":\"Integration of Deep Reinforcement Learning and Discrete-Event Simulation for Real-Time Scheduling of a Flexible Job Shop Production\",\"authors\":\"S. Lang, Fabian Behrendt, Nico Lanzerath, T. Reggelin, Marcel Müller\",\"doi\":\"10.1109/WSC48552.2020.9383997\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The following paper presents the application of Deep Q-Networks (DQN) for solving a flexible job shop problem with integrated process planning. DQN is a deep reinforcement learning algorithm, which aims to train an agent to perform a specific task. In particular, we train two DQN agents in connection with a discrete-event simulation model of the problem, where one agent is responsible for the selection of operation sequences, while the other allocates jobs to machines. We compare the performance of DQN with the GRASP metaheuristic. After less than one hour of training, DQN generates schedules providing a lower makespan and total tardiness as the GRASP algorithm. Our first investigations reveal that DQN seems to generalize the training data to other problem cases. Once trained, the prediction and evaluation of new production schedules requires less than 0.2 seconds.\",\"PeriodicalId\":6692,\"journal\":{\"name\":\"2020 Winter Simulation Conference (WSC)\",\"volume\":\"142 1\",\"pages\":\"3057-3068\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-12-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"17\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 Winter Simulation Conference (WSC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/WSC48552.2020.9383997\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 Winter Simulation Conference (WSC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/WSC48552.2020.9383997","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Integration of Deep Reinforcement Learning and Discrete-Event Simulation for Real-Time Scheduling of a Flexible Job Shop Production
The following paper presents the application of Deep Q-Networks (DQN) for solving a flexible job shop problem with integrated process planning. DQN is a deep reinforcement learning algorithm, which aims to train an agent to perform a specific task. In particular, we train two DQN agents in connection with a discrete-event simulation model of the problem, where one agent is responsible for the selection of operation sequences, while the other allocates jobs to machines. We compare the performance of DQN with the GRASP metaheuristic. After less than one hour of training, DQN generates schedules providing a lower makespan and total tardiness as the GRASP algorithm. Our first investigations reveal that DQN seems to generalize the training data to other problem cases. Once trained, the prediction and evaluation of new production schedules requires less than 0.2 seconds.