Deep Reinforcement Learning-based Multi-Objective Scheduling for Distributed Heterogeneous Hybrid Flow Shops with Blocking Constraints

IF 10.1 | CAS Region 1 (Engineering & Technology) | Q1 ENGINEERING, MULTIDISCIPLINARY
Xueyan Sun, Weiming Shen, Jiaxin Fan, Birgit Vogel-Heuser, Fandi Bi, Chunjiang Zhang
{"title":"基于深度强化学习的分布式异构混合流车间阻塞约束多目标调度","authors":"Xueyan Sun ,&nbsp;Weiming Shen ,&nbsp;Jiaxin Fan ,&nbsp;Birgit Vogel-Heuser ,&nbsp;Fandi Bi ,&nbsp;Chunjiang Zhang","doi":"10.1016/j.eng.2024.11.033","DOIUrl":null,"url":null,"abstract":"<div><div>This paper investigates a distributed heterogeneous hybrid blocking flow-shop scheduling problem (DHHBFSP) designed to minimize the total tardiness and total energy consumption simultaneously, and proposes an improved proximal policy optimization (IPPO) method to make real-time decisions for the DHHBFSP. A multi-objective Markov decision process is modeled for the DHHBFSP, where the reward function is represented by a vector with dynamic weights instead of the common objective-related scalar value. A factory agent (FA) is formulated for each factory to select unscheduled jobs and is trained by the proposed IPPO to improve the decision quality. Multiple FAs work asynchronously to allocate jobs that arrive randomly at the shop. A two-stage training strategy is introduced in the IPPO, which learns from both single- and dual-policy data for better data utilization. The proposed IPPO is tested on randomly generated instances and compared with variants of the basic proximal policy optimization (PPO), dispatch rules, multi-objective metaheuristics, and multi-agent reinforcement learning methods. Extensive experimental results suggest that the proposed strategies offer significant improvements to the basic PPO, and the proposed IPPO outperforms the state-of-the-art scheduling methods in both convergence and solution quality.</div></div>","PeriodicalId":11783,"journal":{"name":"Engineering","volume":"46 ","pages":"Pages 278-291"},"PeriodicalIF":10.1000,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Deep Reinforcement Learning-based Multi-Objective Scheduling for Distributed Heterogeneous Hybrid Flow Shops with Blocking Constraints\",\"authors\":\"Xueyan Sun ,&nbsp;Weiming Shen ,&nbsp;Jiaxin Fan ,&nbsp;Birgit Vogel-Heuser ,&nbsp;Fandi Bi ,&nbsp;Chunjiang Zhang\",\"doi\":\"10.1016/j.eng.2024.11.033\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>This paper investigates a distributed heterogeneous hybrid blocking flow-shop scheduling problem (DHHBFSP) designed to minimize the total tardiness and total energy consumption simultaneously, and proposes an improved proximal policy optimization (IPPO) method to make real-time decisions for the DHHBFSP. A multi-objective Markov decision process is modeled for the DHHBFSP, where the reward function is represented by a vector with dynamic weights instead of the common objective-related scalar value. A factory agent (FA) is formulated for each factory to select unscheduled jobs and is trained by the proposed IPPO to improve the decision quality. Multiple FAs work asynchronously to allocate jobs that arrive randomly at the shop. A two-stage training strategy is introduced in the IPPO, which learns from both single- and dual-policy data for better data utilization. The proposed IPPO is tested on randomly generated instances and compared with variants of the basic proximal policy optimization (PPO), dispatch rules, multi-objective metaheuristics, and multi-agent reinforcement learning methods. 
Extensive experimental results suggest that the proposed strategies offer significant improvements to the basic PPO, and the proposed IPPO outperforms the state-of-the-art scheduling methods in both convergence and solution quality.</div></div>\",\"PeriodicalId\":11783,\"journal\":{\"name\":\"Engineering\",\"volume\":\"46 \",\"pages\":\"Pages 278-291\"},\"PeriodicalIF\":10.1000,\"publicationDate\":\"2025-03-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Engineering\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2095809924007264\",\"RegionNum\":1,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, MULTIDISCIPLINARY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Engineering","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2095809924007264","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, MULTIDISCIPLINARY","Score":null,"Total":0}
Citations: 0

Abstract

This paper investigates a distributed heterogeneous hybrid blocking flow-shop scheduling problem (DHHBFSP) designed to minimize the total tardiness and total energy consumption simultaneously, and proposes an improved proximal policy optimization (IPPO) method to make real-time decisions for the DHHBFSP. A multi-objective Markov decision process is modeled for the DHHBFSP, where the reward function is represented by a vector with dynamic weights instead of the common objective-related scalar value. A factory agent (FA) is formulated for each factory to select unscheduled jobs and is trained by the proposed IPPO to improve the decision quality. Multiple FAs work asynchronously to allocate jobs that arrive randomly at the shop. A two-stage training strategy is introduced in the IPPO, which learns from both single- and dual-policy data for better data utilization. The proposed IPPO is tested on randomly generated instances and compared with variants of the basic proximal policy optimization (PPO), dispatch rules, multi-objective metaheuristics, and multi-agent reinforcement learning methods. Extensive experimental results suggest that the proposed strategies offer significant improvements to the basic PPO, and the proposed IPPO outperforms the state-of-the-art scheduling methods in both convergence and solution quality.
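For intuition, the sketch below shows one way a vector reward with dynamic weights might be scalarized for a PPO-style update, as described in the abstract. The weight-update rule (`update_weights`) and all numeric values are hypothetical placeholders for illustration; the abstract does not specify the authors' actual mechanism.

```python
import numpy as np

# Illustrative sketch only. The paper represents the reward as a vector with
# dynamic weights (one component per objective: total tardiness and total
# energy consumption) instead of a fixed scalar. The weight-update rule below
# is an assumption for illustration, not the authors' published method.

def scalarize(reward_vec: np.ndarray, weights: np.ndarray) -> float:
    """Collapse the multi-objective reward vector into a scalar for the policy update."""
    return float(np.dot(weights, reward_vec))

def update_weights(weights: np.ndarray, lag: np.ndarray) -> np.ndarray:
    """Shift weight toward the objective that is currently lagging (assumed rule)."""
    w = weights.copy()
    w[np.argmax(lag)] += 0.1   # boost the worst-performing objective
    return w / w.sum()         # renormalize onto the probability simplex

# Example with two objectives: [total tardiness, total energy consumption]
weights = np.array([0.5, 0.5])
reward_vec = np.array([-3.0, -1.5])  # per-step costs, expressed as negative rewards
lag = np.array([0.8, 0.3])           # tardiness is currently further from its target
weights = update_weights(weights, lag)
print(weights, scalarize(reward_vec, weights))
```

Dynamically reweighting the objectives in this fashion lets a single scalar policy-gradient update trade off tardiness against energy consumption over the course of training, which is the advantage the abstract claims over a fixed objective-related scalar reward.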
Source journal: Engineering (Environmental Science-Environmental Engineering)
Self-citation rate: 1.60%
Annual publications: 335
Review time: 35 days
About the journal: Engineering, an international open-access journal initiated by the Chinese Academy of Engineering (CAE) in 2015, serves as a distinguished platform for disseminating cutting-edge advancements in engineering R&D, sharing major research outputs, and highlighting key achievements worldwide. The journal's objectives encompass reporting progress in engineering science, fostering discussions on hot topics, addressing areas of interest, challenges, and prospects in engineering development, while considering human and environmental well-being and ethics in engineering. It aims to inspire breakthroughs and innovations with profound economic and social significance, propelling them to advanced international standards and transforming them into a new productive force. Ultimately, this endeavor seeks to bring about positive changes globally, benefit humanity, and shape a new future.