Probing an LSTM-PPO-Based reinforcement learning algorithm to solve dynamic job shop scheduling problem
Authors: Wei Chen, Zequn Zhang, Dunbing Tang, Changchun Liu, Yong Gui, Qingwei Nie, Zhen Zhao
Journal: Computers & Industrial Engineering, volume 197, Article 110633 (published 2024-10-14)
Impact factor: 6.7; JCR: Q1 (Computer Science, Interdisciplinary Applications)
DOI: 10.1016/j.cie.2024.110633
URL: https://www.sciencedirect.com/science/article/pii/S0360835224007551
As personalized demand grows and social productivity continues to improve, the centralized production model of large batches and few varieties is gradually giving way to a personalized model of small batches and many varieties, which makes the job shop manufacturing process increasingly complex. Furthermore, disruptive events such as machine failures and rush orders increase the uncertainty and variability of the production environment. Traditional scheduling methods are usually based on fixed rules and heuristic algorithms, which adapt poorly to constantly changing production environments and demands. This may lead to inaccurate scheduling decisions and hinder the optimal allocation of job shop resources. To solve the dynamic job shop scheduling problem (JSP) more effectively, this paper proposes a reinforcement learning (RL) optimization algorithm that integrates a long short-term memory (LSTM) neural network with proximal policy optimization (PPO). The algorithm dynamically adjusts scheduling strategies as the production environment changes, maintaining comprehensive awareness of the job shop state in order to make optimal scheduling decisions. First, a state-aware network framework based on LSTM-PPO is proposed to perceive job shop state changes in real time. Then, the state and action spaces of the job shop are described within this framework. Finally, an experimental environment is established to verify the algorithm’s effectiveness. Once trained, the LSTM-PPO algorithm can achieve better performance than other scheduling methods. Comparing the initial planning time with the actual completion time of the rescheduling decision under different dynamic disturbances verifies the efficiency of the proposed algorithm for the dynamic JSP.
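To make the PPO half of the method concrete, the sketch below computes the standard PPO clipped surrogate objective over a batch of decisions. It is not the authors' implementation: the abstract does not specify the network, the state and action encoding, the reward, or any hyperparameters. The function name `ppo_clip_objective` and the default `clip_eps=0.2` are assumptions chosen for illustration. In a scheduling setting, each action would presumably be a scheduling decision and each log-probability would come from the policy network, which in this paper is LSTM-based.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (to be maximized) over a batch.

    new_log_probs / old_log_probs: log pi(a|s) under the current policy and
    under the policy that collected the data. advantages: advantage estimates.
    clip_eps is a hypothetical default, not a value taken from the paper.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio r = pi_new(a|s) / pi_old(a|s)
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - eps, 1 + eps]
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the pessimistic (smaller) of the two surrogate terms
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The clipping removes the incentive to move the new policy far from the data-collecting policy in a single update, which is one common reason PPO is chosen when the environment changes during training.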
About the journal:
Computers & Industrial Engineering (CAIE) is dedicated to researchers, educators, and practitioners in industrial engineering and related fields. Pioneering the integration of computers in research, education, and practice, industrial engineering has evolved to make computers and electronic communication integral to its domain. CAIE publishes original contributions focusing on the development of novel computerized methodologies to address industrial engineering problems. It also highlights the applications of these methodologies to issues within the broader industrial engineering and associated communities. The journal actively encourages submissions that push the boundaries of fundamental theories and concepts in industrial engineering techniques.