Yong Gui, Dunbing Tang, Yuqian Lu, Haihua Zhu, Zequn Zhang, Changchun Liu
Journal: Robotics and Computer-Integrated Manufacturing, Vol. 95, Article 103038
DOI: 10.1016/j.rcim.2025.103038
Published: 2025-04-25 (Journal Article)
URL: https://www.sciencedirect.com/science/article/pii/S0736584525000924
Real-time response to machine failures in self-organizing production execution using multi-agent reinforcement learning with effective samples
With the growing demand for personalized production, multi-agent technology has been introduced to facilitate rapid self-organizing production execution. The application of communication protocols and dynamic scheduling algorithms supports multi-agent negotiation and real-time scheduling decisions in response to conventional production events. To address machine failures, real-time response strategies have been developed to manage jobs affected by such disruptions. However, the performance of existing strategies varies significantly depending on the real-time production state. In this paper, we propose a real-time response strategy using multi-agent reinforcement learning (MARL) that provides an appropriate response strategy for each job affected by machine failures, taking the real-time production state into account. Specifically, we establish a self-organizing production execution process with machine failures to formalize the real-time response problem. Subsequently, a Markov game involving multiple buffer agents is constructed, transforming the real-time response problem into a MARL task. Furthermore, a continuous variable ranging from 0 to 1 is defined as the action space for each buffer agent, allowing it to select a response strategy for each affected job. Finally, a modified multi-agent deep deterministic policy gradient (MADDPG) algorithm is introduced, leveraging effective samples to train buffer agents at each failure moment. This enables the selection of an optimal response strategy for each affected job. Experimental results indicate that the proposed real-time response strategy outperforms both existing response strategies and the original MADDPG-based strategy across 54 distinct production configurations.
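The abstract describes a continuous action space in [0, 1] from which each buffer agent derives a discrete response strategy for an affected job. As a minimal illustrative sketch of that idea only — the strategy names, state features, and linear "actor" below are hypothetical stand-ins, not the paper's actual network or strategies — the mapping might look like this:

```python
import numpy as np

# Hypothetical response strategies; the paper's actual strategies may differ.
STRATEGIES = ["wait_for_repair", "reroute_to_alternative_machine", "requeue_and_renegotiate"]

def actor(state: np.ndarray, weights: np.ndarray) -> float:
    """Deterministic policy sketch: squash a linear score into the (0, 1) action space."""
    score = float(state @ weights)
    return 1.0 / (1.0 + np.exp(-score))  # sigmoid keeps the action inside (0, 1)

def select_strategy(action: float) -> str:
    """Partition the continuous action space into equal bands, one per strategy."""
    idx = min(int(action * len(STRATEGIES)), len(STRATEGIES) - 1)
    return STRATEGIES[idx]

# Example: one affected job, a 3-feature production state (features are assumed,
# e.g. buffer queue length, estimated repair time, job urgency), fixed weights.
state = np.array([0.4, -1.2, 0.7])
weights = np.array([0.5, 0.3, -0.8])
a = actor(state, weights)
print(a, select_strategy(a))
```

In the paper's MADDPG setting the actor would be a trained neural network and the weights would be learned from effective samples collected at failure moments; the fixed linear map above only shows how a single scalar action can index a set of response strategies.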
About the journal:
The journal, Robotics and Computer-Integrated Manufacturing, focuses on sharing research applications that contribute to the development of new or enhanced robotics, manufacturing technologies, and innovative manufacturing strategies that are relevant to industry. Papers that combine theory and experimental validation are preferred, while review papers on current robotics and manufacturing issues are also considered. However, papers on traditional machining processes, modeling and simulation, supply chain management, and resource optimization are generally not within the scope of the journal, as there are more appropriate journals for these topics. Similarly, papers that are overly theoretical or mathematical will be directed to other suitable journals. The journal welcomes original papers in areas such as industrial robotics, human-robot collaboration in manufacturing, cloud-based manufacturing, cyber-physical production systems, big data analytics in manufacturing, smart mechatronics, machine learning, adaptive and sustainable manufacturing, and other fields involving unique manufacturing technologies.