Graph-Based Dual-Agent Deep Reinforcement Learning for Dynamic Human–Machine Hybrid Reconfiguration Manufacturing Scheduling
Authors: Yuxin Li; Qihao Liu; Chunjiang Zhang; Xinyu Li; Liang Gao
Journal: IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 55, no. 11, pp. 8729-8741
Published: 2025-09-26 (Journal Article)
DOI: 10.1109/TSMC.2025.3612300
URL: https://ieeexplore.ieee.org/document/11180937/
Impact factor: 8.7; JCR: Q1 (Automation & Control Systems); Category: Computer Science
Citations: 0
Abstract
Human–machine hybrid reconfiguration manufacturing is an emerging paradigm in the field of precision equipment production and can greatly improve the production capability of the workshop. However, numerous complex constraints and a dynamic environment make reasonable scheduling very difficult. To this end, this article studies the dynamic human–machine hybrid reconfiguration manufacturing scheduling problem (DHMRSP) and proposes a novel deep reinforcement learning (DRL) scheduling method. Specifically, a dual-agent Markov decision process (MDP) is established, which can handle seven complex constraints and three disturbance events. Then, a heterogeneous competition graph attention network (HCGAN) is designed, where the meta-path-based subgraph conversion reflects the resource-operation competition, and three modules use node-level attention and semantic-level attention to realize important information embedding. Afterward, a dual proximal policy optimization (PPO) algorithm with HCGAN and mixed action space (HM-DPPO) is proposed, where the allocation agent and reconfiguration agent achieve collaborative learning by taking joint action and sharing graph embeddings and reward. Experimental results prove that the proposed approach outperforms rules, genetic programming (GP), and three DRL methods on different instances and can effectively handle various disturbance events.
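The abstract does not spell out the loss used by HM-DPPO, so the following is only a minimal sketch of the general idea, assuming the standard PPO clipped surrogate is applied to each agent and that both agents are scored with one advantage derived from the shared reward. The function names, the dictionary layout, and the sample numbers are hypothetical and are not taken from the paper.

```python
import math


def clipped_surrogate(ratio, advantage, eps=0.2):
    """Standard PPO clipped objective for one sample (to be maximized)."""
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)


def dual_agent_objective(samples, eps=0.2):
    """Sum the clipped objectives of both agents, averaged over the batch.

    Assumption: each sample stores, per agent, the log-probability of its
    action under the new and the old policy, plus ONE advantage estimated
    from the reward that the two agents share.
    """
    total = 0.0
    for s in samples:
        for agent in ("allocation", "reconfiguration"):
            new_lp, old_lp = s[agent]
            ratio = math.exp(new_lp - old_lp)  # probability ratio pi_new / pi_old
            total += clipped_surrogate(ratio, s["advantage"], eps)
    return total / len(samples)


# Hypothetical two-step batch: (new log-prob, old log-prob) per agent.
batch = [
    {"allocation": (math.log(0.6), math.log(0.5)),
     "reconfiguration": (math.log(0.3), math.log(0.4)),
     "advantage": 1.0},
    {"allocation": (math.log(0.2), math.log(0.4)),
     "reconfiguration": (math.log(0.5), math.log(0.5)),
     "advantage": -0.5},
]
objective = dual_agent_objective(batch)
```

Because both agents see the same advantage, a joint action that leads to a good schedule raises the probability of both the allocation choice and the reconfiguration choice. This is one plausible reading of "collaborative learning" through shared reward, not a confirmed detail of the method.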
About the journal:
The IEEE Transactions on Systems, Man, and Cybernetics: Systems encompasses the fields of systems engineering, covering issue formulation, analysis, and modeling throughout the systems engineering lifecycle phases. It addresses decision-making, issue interpretation, systems management, processes, and various methods such as optimization, modeling, and simulation in the development and deployment of large systems.