Deep reinforcement learning for machine scheduling: Methodology, the state-of-the-art, and future directions

IF 6.7 | Zone 1 (Engineering Technology) | Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Computers & Industrial Engineering | Pub Date: 2025-02-01 | DOI: 10.1016/j.cie.2025.110856

Maziyar Khadivi, Todd Charter, Marjan Yaghoubi, Masoud Jalayer, Maryam Ahang, Ardeshir Shojaeinasab, Homayoun Najjaran


Citations: 0

Abstract

Machine scheduling aims to optimally assign jobs to a single machine or a group of machines while meeting manufacturing rules as well as job specifications. Optimizing machine schedules leads to a significant reduction in operational costs, better adherence to customer demand, and a rise in production efficiency. Despite its benefits for industry, machine scheduling remains a challenging combinatorial optimization problem because of its Non-deterministic Polynomial-time (NP) hard nature. Deep Reinforcement Learning (DRL) has been regarded as a foundation for “artificial general intelligence”, with promising results in tasks such as gaming and robotics. Since 1995, researchers have also aimed to apply DRL, valued for its ability to extract knowledge from data, across a variety of machine scheduling problems. This paper presents a comprehensive review and comparison of the methodology, applications, advantages, and limitations of different DRL-based approaches. Further, the study categorizes the DRL methods by their integrated computational components, including conventional neural networks, encoder–decoder architectures, graph neural networks, and metaheuristic algorithms. Our literature review concludes that DRL-based approaches surpass exact solvers, heuristics, and tabular reinforcement learning algorithms in computation speed, in generating near-global optimal solutions, or in both. They have been applied to static or dynamic scheduling in different machine environments, namely single machine, parallel machine, flow shop, job shop, and open shop, with different job characteristics. Nonetheless, existing DRL-based schedulers face limitations not only in handling complex operational constraints and configurable multi-objective optimization but also in generalization, scalability, interpretability, and robustness. Addressing these challenges therefore shapes future work in this field. This paper helps researchers survey the state of the art and the research gaps in DRL-based machine scheduling, and it can help experts and practitioners choose a suitable approach for implementing DRL in production scheduling.
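As a rough illustration of the problem class the abstract describes, the sketch below sets up a toy single-machine instance and compares exhaustive enumeration (standing in for an exact solver) with the earliest-due-date dispatching rule (standing in for a heuristic). It is not taken from the paper: the job data, the total-tardiness objective, and all function names are assumptions chosen for illustration. A DRL-based scheduler would, in this framing, replace the fixed dispatching rule with a learned policy.

```python
from itertools import permutations

# Hypothetical toy instance (not from the paper):
# each job is (processing_time, due_date).
JOBS = [(4, 5), (2, 6), (3, 7), (1, 3)]


def total_tardiness(sequence, jobs=JOBS):
    """Total tardiness when jobs run back to back on one machine."""
    clock = 0
    tardiness = 0
    for j in sequence:
        processing, due = jobs[j]
        clock += processing                  # completion time of job j
        tardiness += max(0, clock - due)     # lateness counts only if positive
    return tardiness


def exact_schedule(jobs=JOBS):
    """Enumerate all n! sequences; feasible only for very small n."""
    return min(permutations(range(len(jobs))),
               key=lambda seq: total_tardiness(seq, jobs))


def edd_schedule(jobs=JOBS):
    """Earliest-due-date dispatching rule: fast, but not always optimal."""
    return tuple(sorted(range(len(jobs)), key=lambda j: jobs[j][1]))


if __name__ == "__main__":
    best = exact_schedule()
    rule = edd_schedule()
    print("exact:", best, total_tardiness(best))
    print("EDD:  ", rule, total_tardiness(rule))
```

The enumeration grows factorially with the number of jobs, which is why exact methods become impractical at realistic sizes, while the dispatching rule needs only a sort but carries no optimality guarantee for this objective in general.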


Source journal

Computers & Industrial Engineering (category: Engineering Technology, Industrial Engineering)

CiteScore: 12.70

Self-citation rate: 12.70%

Articles published: 794

Review time: 10.6 months

Journal description: Computers & Industrial Engineering (CAIE) is dedicated to researchers, educators, and practitioners in industrial engineering and related fields. Pioneering the integration of computers in research, education, and practice, industrial engineering has evolved to make computers and electronic communication integral to its domain. CAIE publishes original contributions focusing on the development of novel computerized methodologies to address industrial engineering problems. It also highlights the applications of these methodologies to issues within the broader industrial engineering and associated communities. The journal actively encourages submissions that push the boundaries of fundamental theories and concepts in industrial engineering techniques.