{"title":"A modified multi-agent proximal policy optimization algorithm for multi-objective dynamic partial-re-entrant hybrid flow shop scheduling problem","authors":"Jiawei Wu, Yong Liu","doi":"10.1016/j.engappai.2024.109688","DOIUrl":null,"url":null,"abstract":"<div><div>This paper extends a novel model for modern flexible manufacturing systems: the multi-objective dynamic partial-re-entrant hybrid flow shop scheduling problem (MDPR-HFSP). The model considers partial-re-entrant processing, dynamic disturbance events, green manufacturing demand, and machine workload. Despite advancements in applying deep reinforcement learning to dynamic workshop scheduling, current methods face challenges in training scheduling policies for partial-re-entrant processing constraints and multiple manufacturing objectives. To solve the MDPR-HFSP, we propose a modified multi-agent proximal policy optimization (MMAPPO) algorithm, which employs a routing agent (RA) for machine assignment and a sequencing agent (SA) for job selection. Four machine assignment rules and four job selection rules are integrated to choose optimum actions for RA and SA at rescheduling points. In addition, reward signals are created by combining objective weight vectors with reward vectors, and training parameters under each weight vector are saved to flexibly optimize three objectives. Furthermore, we design an adaptive trust region clipping method to improve the constraint of the proximal policy optimization algorithm on the differences between new and old policies by introducing the Wasserstein distance. Moreover, we conduct comprehensive numerical experiments to compare the proposed MMAPPO algorithm with nine composite scheduling rules and the basic multi-agent proximal policy optimization algorithm. The results demonstrate that the proposed MMAPPO algorithm is more effective in solving the MDPR-HFSP and achieves superior convergence and diversity in solutions. Finally, a semiconductor wafer manufacturing case is resolved by the MMAPPO, and the scheduling solution meets the responsive requirement.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"140 ","pages":"Article 109688"},"PeriodicalIF":7.5000,"publicationDate":"2024-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Engineering Applications of Artificial Intelligence","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0952197624018463","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
Abstract
This paper presents a novel model for modern flexible manufacturing systems: the multi-objective dynamic partial-re-entrant hybrid flow shop scheduling problem (MDPR-HFSP). The model considers partial-re-entrant processing, dynamic disturbance events, green manufacturing demand, and machine workload. Despite advances in applying deep reinforcement learning to dynamic shop scheduling, current methods struggle to train scheduling policies under partial-re-entrant processing constraints and multiple manufacturing objectives. To solve the MDPR-HFSP, we propose a modified multi-agent proximal policy optimization (MMAPPO) algorithm that employs a routing agent (RA) for machine assignment and a sequencing agent (SA) for job selection. Four machine assignment rules and four job selection rules are integrated so that the RA and SA can select optimal actions at rescheduling points. In addition, reward signals are formed by combining objective weight vectors with reward vectors, and the training parameters learned under each weight vector are saved so that the three objectives can be optimized flexibly. Furthermore, we design an adaptive trust region clipping method that introduces the Wasserstein distance to better constrain the divergence between new and old policies in the proximal policy optimization algorithm. Comprehensive numerical experiments compare the proposed MMAPPO algorithm with nine composite scheduling rules and the basic multi-agent proximal policy optimization algorithm; the results demonstrate that MMAPPO is more effective in solving the MDPR-HFSP and achieves superior convergence and diversity of solutions. Finally, a semiconductor wafer manufacturing case is solved by MMAPPO, and the resulting schedule meets the responsiveness requirement.
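Two of the abstract's algorithmic ingredients can be illustrated concretely. The sketch below, in plain NumPy, is not the authors' implementation; it shows (1) scalarizing a per-objective reward vector with an objective weight vector and (2) a PPO-style clipped surrogate loss whose clip range adapts with the 1-Wasserstein distance between the old and new discrete action distributions. The function names, the specific adaptation rule for the clip range, and the toy data are all illustrative assumptions rather than details taken from the paper.

```python
# Minimal sketch (illustrative only, not the authors' MMAPPO code) of:
#  (1) weighted scalarization of a multi-objective reward vector, and
#  (2) a clipped PPO surrogate whose clip range tightens as the new policy
#      drifts (in 1-Wasserstein distance) from the old one.
import numpy as np

def scalarize_reward(reward_vector, weight_vector):
    """Combine a per-objective reward vector (e.g. tardiness, energy, workload
    terms) into one scalar signal using a weight vector that sums to 1."""
    return float(np.dot(weight_vector, reward_vector))

def wasserstein_1d(p, q):
    """1-Wasserstein distance between two discrete distributions over the same
    ordered action set, computed from the cumulative distributions."""
    return float(np.sum(np.abs(np.cumsum(p) - np.cumsum(q))))

def adaptive_clip_ppo_loss(new_probs, old_probs, actions, advantages,
                           base_eps=0.2, sensitivity=1.0):
    """Clipped PPO surrogate in which the clip range shrinks as the batch-mean
    new policy moves away from the old one -- an assumed, simplified form of
    'adaptive trust region clipping'."""
    # Crude summary of the policy shift: Wasserstein distance between the
    # batch-averaged old and new action distributions.
    w_dist = wasserstein_1d(new_probs.mean(axis=0), old_probs.mean(axis=0))
    eps = base_eps / (1.0 + sensitivity * w_dist)  # tighter band when drift grows
    idx = np.arange(len(actions))
    ratio = new_probs[idx, actions] / old_probs[idx, actions]
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps)
    return -np.mean(np.minimum(ratio * advantages, clipped * advantages))

if __name__ == "__main__":
    # Toy data: 4 candidate dispatching rules (actions), batch of 3 decisions.
    rng = np.random.default_rng(0)
    old = rng.dirichlet(np.ones(4), size=3)
    new = rng.dirichlet(np.ones(4), size=3)
    acts = np.array([0, 2, 1])
    adv = np.array([0.5, -0.2, 1.0])
    r = scalarize_reward(np.array([-3.0, -1.5, -0.8]),  # per-objective rewards
                         np.array([0.5, 0.3, 0.2]))     # objective weight vector
    print("scalar reward:", r)
    print("surrogate loss:", adaptive_clip_ppo_loss(new, old, acts, adv))
```

Shrinking the clip range as the measured distance between old and new policies grows mimics a trust region: the larger the observed policy shift, the tighter the permitted probability-ratio band during the update.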
Journal Introduction
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, which is witnessing remarkable advancements across machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.