A Deep Reinforcement Learning Method Based on a Transformer Model for the Flexible Job Shop Scheduling Problem

IF 2.6 3区 工程技术 Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS
Shuai Xu, Yanwu Li, Qiuyang Li
{"title":"A Deep Reinforcement Learning Method Based on a Transformer Model for the Flexible Job Shop Scheduling Problem","authors":"Shuai Xu, Yanwu Li, Qiuyang Li","doi":"10.3390/electronics13183696","DOIUrl":null,"url":null,"abstract":"The flexible job shop scheduling problem (FJSSP), which can significantly enhance production efficiency, is a mathematical optimization problem widely applied in modern manufacturing industries. However, due to its NP-hard nature, finding an optimal solution for all scenarios within a reasonable time frame faces serious challenges. This paper proposes a solution that transforms the FJSSP into a Markov Decision Process (MDP) and employs deep reinforcement learning (DRL) techniques for resolution. First, we represent the state features of the scheduling environment using seven feature vectors and utilize a transformer encoder as a feature extraction module to effectively capture the relationships between state features and enhance representation capability. Second, based on the features of the jobs and machines, we design 16 composite dispatching rules from multiple dimensions, including the job completion rate, processing time, waiting time, and manufacturing resource utilization, to achieve flexible and efficient scheduling decisions. Furthermore, we project an intuitive and dense reward function with the objective of minimizing the total idle time of machines. Finally, to verify the performance and feasibility of the algorithm, we evaluate the proposed policy model on the Brandimarte, Hurink, and Dauzere datasets. Our experimental results demonstrate that the proposed framework consistently outperforms traditional dispatching rules, surpasses metaheuristic methods on larger-scale instances, and exceeds the performance of existing DRL-based scheduling methods across most datasets.","PeriodicalId":11646,"journal":{"name":"Electronics","volume":"2 1","pages":""},"PeriodicalIF":2.6000,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Electronics","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.3390/electronics13183696","RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0

Abstract

The flexible job shop scheduling problem (FJSSP), which can significantly enhance production efficiency, is a mathematical optimization problem widely applied in modern manufacturing industries. However, due to its NP-hard nature, finding an optimal solution for all scenarios within a reasonable time frame faces serious challenges. This paper proposes a solution that transforms the FJSSP into a Markov Decision Process (MDP) and employs deep reinforcement learning (DRL) techniques for resolution. First, we represent the state features of the scheduling environment using seven feature vectors and utilize a transformer encoder as a feature extraction module to effectively capture the relationships between state features and enhance representation capability. Second, based on the features of the jobs and machines, we design 16 composite dispatching rules from multiple dimensions, including the job completion rate, processing time, waiting time, and manufacturing resource utilization, to achieve flexible and efficient scheduling decisions. Furthermore, we project an intuitive and dense reward function with the objective of minimizing the total idle time of machines. Finally, to verify the performance and feasibility of the algorithm, we evaluate the proposed policy model on the Brandimarte, Hurink, and Dauzere datasets. Our experimental results demonstrate that the proposed framework consistently outperforms traditional dispatching rules, surpasses metaheuristic methods on larger-scale instances, and exceeds the performance of existing DRL-based scheduling methods across most datasets.
基于变压器模型的深度强化学习法,用于灵活的作业车间调度问题
灵活作业车间调度问题(FJSSP)能显著提高生产效率,是现代制造业广泛应用的数学优化问题。然而,由于 FJSSP 具有 NP 难的性质,要在合理的时间内找到适用于所有情况的最优解,面临着严峻的挑战。本文提出了一种解决方案,将 FJSSP 转化为马尔可夫决策过程(MDP),并采用深度强化学习(DRL)技术加以解决。首先,我们用七个特征向量表示调度环境的状态特征,并利用变换器编码器作为特征提取模块,有效捕捉状态特征之间的关系,增强表示能力。其次,根据作业和机器的特征,从作业完成率、处理时间、等待时间、制造资源利用率等多个维度设计出 16 条复合调度规则,实现灵活高效的调度决策。此外,我们还以最小化机器总闲置时间为目标,预测了一个直观且密集的奖励函数。最后,为了验证算法的性能和可行性,我们在 Brandimarte、Hurink 和 Dauzere 数据集上对提出的策略模型进行了评估。实验结果表明,在大多数数据集上,提议的框架始终优于传统的调度规则,在更大规模的实例上超越了元启发式方法,并超过了现有的基于 DRL 的调度方法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Electronics
Electronics Computer Science-Computer Networks and Communications
CiteScore
1.10
自引率
10.30%
发文量
3515
审稿时长
16.71 days
期刊介绍: Electronics (ISSN 2079-9292; CODEN: ELECGJ) is an international, open access journal on the science of electronics and its applications published quarterly online by MDPI.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信