A deep reinforcement learning approach with graph attention network and multi-signal differential reward for dynamic hybrid flow shop scheduling problem

IF 12.2 | Region 1 (Engineering Technology) | JCR Q1 ENGINEERING, INDUSTRIAL
Youshan Liu, Jiaxin Fan, Weiming Shen
Journal of Manufacturing Systems, Volume 80, Pages 643-661. Published 2025-04-13. Journal Article.
DOI: 10.1016/j.jmsy.2025.03.028
URL: https://www.sciencedirect.com/science/article/pii/S0278612525000883
Citations: 0

Abstract

In real-life manufacturing systems, production management often faces uncertainty due to urgent demands and dynamic job insertions. Such uncertain environments pose significant challenges for scheduling, particularly in minimizing delivery delays and improving overall efficiency. Deep reinforcement learning (DRL) offers the potential for rapid real-time production decisions, but scheduling in these environments with the objective of reducing delivery delays remains challenging. This paper investigates a dynamic hybrid flow-shop scheduling problem with job insertions, with the objective of minimizing the total weighted tardiness (TWT). An end-to-end DRL-based method, proximal policy optimization with graph attention network (PPO-GAT), is proposed to address the problem. First, a multi-agent system is established to simulate the actual manufacturing system and to serve as a foundation for implementing intelligent production scheduling. Then, a novel graph-based state representation is developed to observe the instantaneous state of the hybrid flow-shop. Two graph models are designed to represent system features and job features; graph attention networks (GAT) extract these features and fuse them into a global feature. Next, a multi-signal differential reward (MSDR) function is designed to address the reward sparsity caused by the TWT objective. Finally, ablation experiments are conducted to validate each proposed algorithmic component, and PPO-GAT is compared with benchmark methods. Experimental results demonstrate the superiority of the proposed GAT, MSDR, and PPO-GAT. Moreover, the results indicate that PPO-GAT can make real-time scheduling decisions for hybrid flow-shops of any scale, which makes it a promising solution for wide industrial application.
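To make the objective concrete, the sketch below computes TWT in its standard form, the sum over jobs of weight times max(0, completion time minus due date), and shows one generic way to turn it into a dense per-step signal. The abstract does not give the MSDR formula, so `differential_reward` is only an illustrative assumption (the decrease in an estimated TWT between two consecutive decisions), not the authors' reward. The names `Job`, `total_weighted_tardiness`, and `differential_reward` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Job:
    weight: float            # priority weight of the job
    due_date: float          # due date of the job
    completion_time: float   # actual or estimated completion time


def total_weighted_tardiness(jobs: Iterable[Job]) -> float:
    """Standard TWT: sum of weight * max(0, completion_time - due_date)."""
    return sum(j.weight * max(0.0, j.completion_time - j.due_date) for j in jobs)


def differential_reward(twt_before: float, twt_after: float) -> float:
    """Illustrative dense reward (an assumption, not the paper's MSDR):
    the decrease in estimated TWT caused by one scheduling decision."""
    return twt_before - twt_after


jobs = [
    Job(weight=2.0, due_date=10.0, completion_time=12.0),  # 2 time units late
    Job(weight=1.0, due_date=5.0, completion_time=4.0),    # on time
]
twt = total_weighted_tardiness(jobs)
```

Under this assumption, the per-step rewards telescope: summed over an episode they equal the initial TWT estimate minus the final one, so a reward arrives at every decision rather than only once the schedule is complete.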
Source journal
Journal of Manufacturing Systems (Engineering Technology: Engineering, Industrial)
CiteScore: 23.30
Self-citation rate: 13.20%
Articles published: 216
Review time: 25 days
About the journal: The Journal of Manufacturing Systems is dedicated to showcasing cutting-edge fundamental and applied research in manufacturing at the systems level. Encompassing products, equipment, people, information, control, and support functions, manufacturing systems play a pivotal role in the economical and competitive development, production, delivery, and total lifecycle of products, meeting market and societal needs. With a commitment to publishing archival scholarly literature, the journal strives to advance the state of the art in manufacturing systems and foster innovation in crafting efficient, robust, and sustainable manufacturing systems. The focus extends from equipment-level considerations to the broader scope of the extended enterprise. The Journal welcomes research addressing challenges across various scales, including nano, micro, and macro-scale manufacturing, and spanning diverse sectors such as aerospace, automotive, energy, and medical device manufacturing.