{"title":"基于启发式Petri网驱动的深度强化学习增强机器人作业车间的实时调度","authors":"Sijia Yi , Jiliang Luo","doi":"10.1016/j.rcim.2025.103097","DOIUrl":null,"url":null,"abstract":"<div><div>In robotic job shops (RJS), a significant challenge lies in optimizing task allocation and robot routing simultaneously, especially since these tasks must be accomplished in real-time to efficiently manage unexpected situations, such as the urgent need for AGV recharging or sudden order additions. Deep reinforcement learning (DRL) shows promise for these complex scheduling tasks due to its ability to address problems characterized by substantial computational complexity. However, the rapid expansion of RJS state space and the difficulty of avoiding cyclic loops for AGVs pose significant challenges for DRL in realistic settings. To address these, we present a novel approach combining an artificial-potential-field (APF) with a deep Q-network (DQN) in a Petri net framework. The APF is designed for Petri nets to guide token movement toward goal place nodes. Throughout the learning process, the APF-guided mixed policy employs a cosine-annealing probability for APF policy and a piecewise linear probability for random policy. Initially, action selections predominantly rely on APF policy to efficiently gather high-reward experience. As training progresses, they shifts to more rely on the learned neural-network policy, with random exploration supplementing diversity, ensuring a robust transition from reward-driven exploration to precise decision-making. The APF-DQN method is tested in real-world RJS scenarios, showing superior exploration success and training efficiency over baseline DQN. It significantly outperforms both conventional dispatching rules and baseline DQN, reducing average makespan by over 55% compared to dispatching rules and by 14.9% relative to baseline DQN. This method significantly enhances traditional DQN by improving exploration success, learning efficiency, policy convergence, and adaptability to dynamic environments.</div></div>","PeriodicalId":21452,"journal":{"name":"Robotics and Computer-integrated Manufacturing","volume":"97 ","pages":"Article 103097"},"PeriodicalIF":11.4000,"publicationDate":"2025-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Deep reinforcement learning driven by heuristics with Petri nets for enhancing real-time scheduling in robotic job shops\",\"authors\":\"Sijia Yi , Jiliang Luo\",\"doi\":\"10.1016/j.rcim.2025.103097\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>In robotic job shops (RJS), a significant challenge lies in optimizing task allocation and robot routing simultaneously, especially since these tasks must be accomplished in real-time to efficiently manage unexpected situations, such as the urgent need for AGV recharging or sudden order additions. Deep reinforcement learning (DRL) shows promise for these complex scheduling tasks due to its ability to address problems characterized by substantial computational complexity. However, the rapid expansion of RJS state space and the difficulty of avoiding cyclic loops for AGVs pose significant challenges for DRL in realistic settings. To address these, we present a novel approach combining an artificial-potential-field (APF) with a deep Q-network (DQN) in a Petri net framework. The APF is designed for Petri nets to guide token movement toward goal place nodes. 
Throughout the learning process, the APF-guided mixed policy employs a cosine-annealing probability for APF policy and a piecewise linear probability for random policy. Initially, action selections predominantly rely on APF policy to efficiently gather high-reward experience. As training progresses, they shifts to more rely on the learned neural-network policy, with random exploration supplementing diversity, ensuring a robust transition from reward-driven exploration to precise decision-making. The APF-DQN method is tested in real-world RJS scenarios, showing superior exploration success and training efficiency over baseline DQN. It significantly outperforms both conventional dispatching rules and baseline DQN, reducing average makespan by over 55% compared to dispatching rules and by 14.9% relative to baseline DQN. This method significantly enhances traditional DQN by improving exploration success, learning efficiency, policy convergence, and adaptability to dynamic environments.</div></div>\",\"PeriodicalId\":21452,\"journal\":{\"name\":\"Robotics and Computer-integrated Manufacturing\",\"volume\":\"97 \",\"pages\":\"Article 103097\"},\"PeriodicalIF\":11.4000,\"publicationDate\":\"2025-08-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Robotics and Computer-integrated Manufacturing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0736584525001516\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Robotics and Computer-integrated Manufacturing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0736584525001516","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
Deep reinforcement learning driven by heuristics with Petri nets for enhancing real-time scheduling in robotic job shops
In robotic job shops (RJS), a significant challenge lies in optimizing task allocation and robot routing simultaneously, especially since these tasks must be accomplished in real time to handle unexpected situations efficiently, such as an urgent need for AGV recharging or sudden order additions. Deep reinforcement learning (DRL) shows promise for these complex scheduling tasks owing to its ability to address problems of substantial computational complexity. However, the rapid expansion of the RJS state space and the difficulty of avoiding cyclic loops for AGVs pose significant challenges for DRL in realistic settings. To address these, we present a novel approach combining an artificial potential field (APF) with a deep Q-network (DQN) in a Petri net framework. The APF is designed for Petri nets to guide token movement toward goal place nodes. Throughout the learning process, the APF-guided mixed policy employs a cosine-annealing probability for the APF policy and a piecewise-linear probability for the random policy. Initially, action selection relies predominantly on the APF policy to efficiently gather high-reward experience. As training progresses, it shifts to rely more on the learned neural-network policy, with random exploration supplementing diversity, ensuring a robust transition from reward-driven exploration to precise decision-making. The APF-DQN method is tested in real-world RJS scenarios, showing superior exploration success and training efficiency over a baseline DQN. It significantly outperforms both conventional dispatching rules and the baseline DQN, reducing average makespan by over 55% compared with dispatching rules and by 14.9% relative to the baseline DQN. The method thus enhances traditional DQN by improving exploration success, learning efficiency, policy convergence, and adaptability to dynamic environments.
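To make the action-selection schedule concrete, the following is a minimal Python sketch of how such an APF-guided mixed policy could be implemented. The abstract only states that the APF policy's probability follows cosine annealing and the random policy's probability follows a piecewise-linear schedule; the specific bounds, breakpoints, and function names below are illustrative assumptions, not details taken from the paper.

```python
import math
import random

def apf_probability(step, total_steps, p_start=0.9, p_end=0.05):
    """Cosine-annealed probability of following the APF heuristic policy.

    Decays smoothly from p_start to p_end over training, so early actions
    are mostly heuristic-guided. (Bounds are hypothetical, not from the paper.)
    """
    frac = min(step / total_steps, 1.0)
    return p_end + 0.5 * (p_start - p_end) * (1.0 + math.cos(math.pi * frac))

def random_probability(step, total_steps, p0=0.1, p_mid=0.2, p_end=0.02):
    """Piecewise-linear probability of taking a uniformly random action:
    rises to p_mid at the midpoint of training, then falls to p_end.
    (The breakpoint and values are assumed for illustration.)
    """
    frac = min(step / total_steps, 1.0)
    if frac < 0.5:
        return p0 + (p_mid - p0) * (frac / 0.5)
    return p_mid + (p_end - p_mid) * ((frac - 0.5) / 0.5)

def select_action(step, total_steps, enabled_transitions, apf_policy, dqn_policy):
    """Mixed policy: choose the APF action, a random action, or the learned
    DQN action according to the two schedules; the leftover probability mass
    goes to the learned policy, which dominates late in training."""
    u = random.random()
    p_apf = apf_probability(step, total_steps)
    p_rand = random_probability(step, total_steps)
    if u < p_apf:
        return apf_policy(enabled_transitions)      # potential-field-guided heuristic
    if u < p_apf + p_rand:
        return random.choice(enabled_transitions)   # random exploration for diversity
    return dqn_policy(enabled_transitions)          # learned neural-network policy
```

Under this schedule, at step 0 nearly all probability mass sits on the APF and random policies, matching the abstract's description of reward-driven exploration early on, while by the end of training the learned DQN policy receives almost all of the mass.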
Journal introduction:
The journal, Robotics and Computer-Integrated Manufacturing, focuses on sharing research applications that contribute to the development of new or enhanced robotics, manufacturing technologies, and innovative manufacturing strategies that are relevant to industry. Papers that combine theory and experimental validation are preferred, while review papers on current robotics and manufacturing issues are also considered. However, papers on traditional machining processes, modeling and simulation, supply chain management, and resource optimization are generally not within the scope of the journal, as there are more appropriate journals for these topics. Similarly, papers that are overly theoretical or mathematical will be directed to other suitable journals. The journal welcomes original papers in areas such as industrial robotics, human-robot collaboration in manufacturing, cloud-based manufacturing, cyber-physical production systems, big data analytics in manufacturing, smart mechatronics, machine learning, adaptive and sustainable manufacturing, and other fields involving unique manufacturing technologies.