Dynamic flexible flow shop scheduling via cross-attention networks and multi-agent reinforcement learning

IF 14.2 | Zone 1, Engineering Technology | JCR Q1, ENGINEERING, INDUSTRIAL

Journal of Manufacturing Systems. Pub Date: 2025-03-26. DOI: 10.1016/j.jmsy.2025.03.005

Jinlong Zheng, Yixin Zhao, Yinya Li, Jianfeng Li, Liangeng Wang, Di Yuan

Volume 80, Pages 395-411. Full text: https://www.sciencedirect.com/science/article/pii/S0278612525000652

Citations: 0

Abstract

With the increasing uncertainty in production environments and changes in market demand, flexible and efficient scheduling solutions have become particularly critical. However, existing research mainly focuses on static scheduling or relatively simple dynamic scheduling problems, which are inadequate to address the complexities of actual production processes. This paper considers the dynamic flexible flow shop scheduling problem (DFFSP) characterized by diverse processes, complexity, and high flexibility, and proposes a multi-agent reinforcement learning algorithm based on cross-attention networks (MARL_CA). First, this paper proposes a novel state feature representation method, which represents the job processing data and the production Gantt chart as a state matrix, fully reflecting the environment state in the scheduling process. In addition, a cross-attention network is proposed to extract state features, enabling efficient discovery of complex relationships between jobs and machines, thereby enhancing the model's ability to understand intricate features. The model is trained using independent proximal policy optimization (IPPO), an actor-critic method, to help agents learn accurate and efficient scheduling strategies. Experimental results on a large number of static and dynamic scheduling instances demonstrate that the proposed algorithm outperforms traditional heuristic rules and other advanced algorithms, exhibiting strong learning efficiency and generalization capability.
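The abstract names two generic building blocks, cross-attention between jobs and machines and a PPO-style policy update, without giving their details here. The following is a minimal NumPy sketch of the textbook versions of both, not the authors' MARL_CA implementation. All names, feature dimensions, the single-head design, and the random weights are assumptions made for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the per-row maximum for numerical stability.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(job_feats, machine_feats, w_q, w_k, w_v):
    """Single-head scaled dot-product cross-attention (jobs attend to machines).

    Hypothetical layout: one row per job / per machine, e.g. rows taken
    from a state matrix. Returns the attended job representations and
    the job-to-machine attention weights.
    """
    q = job_feats @ w_q        # (n_jobs, d)
    k = machine_feats @ w_k    # (n_machines, d)
    v = machine_feats @ w_v    # (n_machines, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])   # (n_jobs, n_machines)
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v, weights


def ppo_clip_objective(new_logp, old_logp, advantages, eps=0.2):
    """Standard PPO clipped surrogate objective (to be maximized).

    In an independent-PPO setup, each agent would evaluate this on its
    own trajectories; eps=0.2 is a common default, assumed here.
    """
    ratio = np.exp(new_logp - old_logp)
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps)
    return np.mean(np.minimum(ratio * advantages, clipped * advantages))


# Toy sizes, chosen arbitrarily for the illustration.
rng = np.random.default_rng(0)
n_jobs, n_machines, job_dim, machine_dim, d = 4, 3, 5, 6, 8
job_feats = rng.normal(size=(n_jobs, job_dim))
machine_feats = rng.normal(size=(n_machines, machine_dim))
w_q = rng.normal(size=(job_dim, d))
w_k = rng.normal(size=(machine_dim, d))
w_v = rng.normal(size=(machine_dim, d))

out, weights = cross_attention(job_feats, machine_feats, w_q, w_k, w_v)

# With an unchanged policy the probability ratio is 1, so the objective
# reduces to the mean advantage.
logp = np.log(np.array([0.5, 0.25, 0.25]))
adv = np.array([1.0, -1.0, 2.0])
objective = ppo_clip_objective(logp, logp, adv)
```

Here `weights[i, j]` can be read as how strongly job `i` attends to machine `j`. A trained model would learn `w_q`, `w_k`, and `w_v` rather than sample them.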

Source journal

Journal of Manufacturing Systems (Engineering Technology, Engineering: Industrial)

CiteScore: 23.30
Self-citation rate: 13.20%
Articles published: 216
Review time: 25 days

Journal description: The Journal of Manufacturing Systems is dedicated to showcasing cutting-edge fundamental and applied research in manufacturing at the systems level. Encompassing products, equipment, people, information, control, and support functions, manufacturing systems play a pivotal role in the economical and competitive development, production, delivery, and total lifecycle of products, meeting market and societal needs. With a commitment to publishing archival scholarly literature, the journal strives to advance the state of the art in manufacturing systems and foster innovation in crafting efficient, robust, and sustainable manufacturing systems. The focus extends from equipment-level considerations to the broader scope of the extended enterprise. The Journal welcomes research addressing challenges across various scales, including nano, micro, and macro-scale manufacturing, and spanning diverse sectors such as aerospace, automotive, energy, and medical device manufacturing.