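A modified multi-agent proximal policy optimization algorithm for multi-objective dynamic partial-re-entrant hybrid flow shop scheduling problem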

IF 7.5 | Zone 2, Computer Science | Q1, AUTOMATION & CONTROL SYSTEMS

Engineering Applications of Artificial Intelligence | Pub Date: 2024-11-27 | DOI: 10.1016/j.engappai.2024.109688

Jiawei Wu, Yong Liu

Volume 140, Article 109688. Full text: https://www.sciencedirect.com/science/article/pii/S0952197624018463

Citations: 0

Abstract



A modified multi-agent proximal policy optimization algorithm for multi-objective dynamic partial-re-entrant hybrid flow shop scheduling problem

This paper extends a novel model for modern flexible manufacturing systems: the multi-objective dynamic partial-re-entrant hybrid flow shop scheduling problem (MDPR-HFSP). The model considers partial-re-entrant processing, dynamic disturbance events, green manufacturing demand, and machine workload. Despite advancements in applying deep reinforcement learning to dynamic workshop scheduling, current methods face challenges in training scheduling policies for partial-re-entrant processing constraints and multiple manufacturing objectives. To solve the MDPR-HFSP, we propose a modified multi-agent proximal policy optimization (MMAPPO) algorithm, which employs a routing agent (RA) for machine assignment and a sequencing agent (SA) for job selection. Four machine assignment rules and four job selection rules are integrated to choose optimum actions for RA and SA at rescheduling points. In addition, reward signals are created by combining objective weight vectors with reward vectors, and training parameters under each weight vector are saved to flexibly optimize three objectives. Furthermore, we design an adaptive trust region clipping method to improve the constraint of the proximal policy optimization algorithm on the differences between new and old policies by introducing the Wasserstein distance. Moreover, we conduct comprehensive numerical experiments to compare the proposed MMAPPO algorithm with nine composite scheduling rules and the basic multi-agent proximal policy optimization algorithm. The results demonstrate that the proposed MMAPPO algorithm is more effective in solving the MDPR-HFSP and achieves superior convergence and diversity in solutions. Finally, a semiconductor wafer manufacturing case is resolved by the MMAPPO, and the scheduling solution meets the responsive requirement.
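The two mechanisms named in the abstract, a reward signal built from an objective weight vector and a reward vector, and a clipping range that adapts to the Wasserstein distance between old and new policies, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract gives no formulas, so the weighted-sum scalarization, the unit spacing between actions used for the distance, the shrink rule in `adaptive_epsilon`, and every function and parameter name here are assumptions. Only `clipped_surrogate` follows the standard PPO clipped objective.

```python
from typing import Sequence


def scalarize_reward(weights: Sequence[float], rewards: Sequence[float]) -> float:
    """Combine a reward vector with an objective weight vector.

    Assumption: a plain weighted sum; the paper may combine them differently.
    """
    if len(weights) != len(rewards):
        raise ValueError("weights and rewards must have the same length")
    return sum(w * r for w, r in zip(weights, rewards))


def wasserstein_1(p: Sequence[float], q: Sequence[float]) -> float:
    """Wasserstein-1 distance between two discrete distributions.

    Assumption: actions sit on the indices 0..n-1 with unit spacing, so the
    distance is the sum of absolute differences between the two CDFs.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    cdf_gap = 0.0
    total = 0.0
    for p_i, q_i in zip(p, q):
        cdf_gap += p_i - q_i      # running difference of the two CDFs
        total += abs(cdf_gap)
    return total


def adaptive_epsilon(base_eps: float, distance: float, sensitivity: float = 1.0) -> float:
    """Hypothetical rule: shrink the clip range as the policies drift apart."""
    return base_eps / (1.0 + sensitivity * distance)


def clipped_surrogate(ratio: float, advantage: float, eps: float) -> float:
    """Standard PPO clipped objective for a single sample."""
    clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


# Illustrative values only: three objectives and four candidate rules,
# matching the counts mentioned in the abstract.
reward = scalarize_reward((0.5, 0.25, 0.25), (1.0, -2.0, 4.0))
old_policy = (0.25, 0.25, 0.25, 0.25)
new_policy = (0.5, 0.25, 0.25, 0.0)
eps = adaptive_epsilon(0.2, wasserstein_1(old_policy, new_policy))
objective = clipped_surrogate(ratio=1.5, advantage=reward, eps=eps)
```

In this sketch, a larger distance between the old and new action distributions yields a smaller clip range, so the update becomes more conservative when the policy has already moved far. Saving one set of trained parameters per weight vector, as the abstract describes, would correspond to repeating training with different `weights` arguments.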


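Source journal

Engineering Applications of Artificial Intelligence (Engineering & Technology / Engineering: Electrical & Electronic)

CiteScore: 9.60

Self-citation rate: 10.00%

Articles published: 505

Review time: 68 days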

Journal description: Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.