{"title":"基于深度Q网络的制造资源调度","authors":"Yufei Zhang, Yuanhao Zou, Xiaodong Zhao","doi":"10.1051/wujns/2022276531","DOIUrl":null,"url":null,"abstract":"To optimize machine allocation and task dispatching in smart manufacturing factories, this paper proposes a manufacturing resource scheduling framework based on reinforcement learning (RL). The framework formulates the entire scheduling process as a multi-stage sequential decision problem, and further obtains the scheduling order by the combination of deep convolutional neural network (CNN) and improved deep Q-network (DQN). Specifically, with respect to the representation of the Markov decision process (MDP), the feature matrix is considered as the state space and a set of heuristic dispatching rules are denoted as the action space. In addition, the deep CNN is employed to approximate the state-action values, and the double dueling deep Q-network with prioritized experience replay and noisy network (D3QPN2) is adopted to determine the appropriate action according to the current state. In the experiments, compared with the traditional heuristic method, the proposed method is able to learn high-quality scheduling policy and achieve shorter makespan on the standard public datasets.","PeriodicalId":23976,"journal":{"name":"Wuhan University Journal of Natural Sciences","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Manufacturing Resource Scheduling Based on Deep Q-Network\",\"authors\":\"Yufei Zhang, Yuanhao Zou, Xiaodong Zhao\",\"doi\":\"10.1051/wujns/2022276531\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"To optimize machine allocation and task dispatching in smart manufacturing factories, this paper proposes a manufacturing resource scheduling framework based on reinforcement learning (RL). The framework formulates the entire scheduling process as a multi-stage sequential decision problem, and further obtains the scheduling order by the combination of deep convolutional neural network (CNN) and improved deep Q-network (DQN). Specifically, with respect to the representation of the Markov decision process (MDP), the feature matrix is considered as the state space and a set of heuristic dispatching rules are denoted as the action space. In addition, the deep CNN is employed to approximate the state-action values, and the double dueling deep Q-network with prioritized experience replay and noisy network (D3QPN2) is adopted to determine the appropriate action according to the current state. In the experiments, compared with the traditional heuristic method, the proposed method is able to learn high-quality scheduling policy and achieve shorter makespan on the standard public datasets.\",\"PeriodicalId\":23976,\"journal\":{\"name\":\"Wuhan University Journal of Natural Sciences\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Wuhan University Journal of Natural Sciences\",\"FirstCategoryId\":\"1093\",\"ListUrlMain\":\"https://doi.org/10.1051/wujns/2022276531\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"Multidisciplinary\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Wuhan University Journal of Natural Sciences","FirstCategoryId":"1093","ListUrlMain":"https://doi.org/10.1051/wujns/2022276531","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Multidisciplinary","Score":null,"Total":0}
Manufacturing Resource Scheduling Based on Deep Q-Network
To optimize machine allocation and task dispatching in smart manufacturing factories, this paper proposes a manufacturing resource scheduling framework based on reinforcement learning (RL). The framework formulates the entire scheduling process as a multi-stage sequential decision problem, and further obtains the scheduling order by the combination of deep convolutional neural network (CNN) and improved deep Q-network (DQN). Specifically, with respect to the representation of the Markov decision process (MDP), the feature matrix is considered as the state space and a set of heuristic dispatching rules are denoted as the action space. In addition, the deep CNN is employed to approximate the state-action values, and the double dueling deep Q-network with prioritized experience replay and noisy network (D3QPN2) is adopted to determine the appropriate action according to the current state. In the experiments, compared with the traditional heuristic method, the proposed method is able to learn high-quality scheduling policy and achieve shorter makespan on the standard public datasets.
期刊介绍:
Wuhan University Journal of Natural Sciences aims to promote rapid communication and exchange between the World and Wuhan University, as well as other Chinese universities and academic institutions. It mainly reflects the latest advances being made in many disciplines of scientific research in Chinese universities and academic institutions. The journal also publishes papers presented at conferences in China and abroad. The multi-disciplinary nature of Wuhan University Journal of Natural Sciences is apparent in the wide range of articles from leading Chinese scholars. This journal also aims to introduce Chinese academic achievements to the world community, by demonstrating the significance of Chinese scientific investigations.