Reinforcement learning-guided two-stage optimization framework for multi-product batch scheduling

IF 3.9 · CAS Region 2 (Engineering & Technology) · JCR Q2, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
Jiawen Zhu, Wenli Du, Chen Fan, Muyi Huang, Chuan Wang, Furong Zhang
{"title":"Reinforcement learning-guided two-stage optimization framework for multi-product batch scheduling","authors":"Jiawen Zhu ,&nbsp;Wenli Du ,&nbsp;Chen Fan ,&nbsp;Muyi Huang ,&nbsp;Chuan Wang ,&nbsp;Furong Zhang","doi":"10.1016/j.compchemeng.2025.109399","DOIUrl":null,"url":null,"abstract":"<div><div>With the increasing demand for high-end and fine manufacturing, multi-product batch scheduling has become essential in process industries. Its inherent complexity stems from hybrid decision variables and tightly coupled constraints. To address these challenges, this study proposes a two-stage optimization framework that integrates reinforcement learning (RL) and mathematical programming (MP). The RL layer determines batch allocations and production sequences, which are then transmitted as time windows within which the MP layer optimizes continuous variables to ensure feasibility. To handle hybrid action spaces, a mapping mechanism is introduced to unify discrete and continuous decisions. In addition, dynamic short-term targets based on reformulated constraints are designed to address the sparsity of rewards caused by long-horizon objectives. Experiments on polyolefin production scheduling demonstrate that the proposed method outperforms MP and standalone RL in terms of economic profit, production stability, and computational performance.</div></div>","PeriodicalId":286,"journal":{"name":"Computers & Chemical Engineering","volume":"204 ","pages":"Article 109399"},"PeriodicalIF":3.9000,"publicationDate":"2025-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers & Chemical Engineering","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0098135425004028","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
Citations: 0

Abstract

With the increasing demand for high-end and fine manufacturing, multi-product batch scheduling has become essential in process industries. Its inherent complexity stems from hybrid decision variables and tightly coupled constraints. To address these challenges, this study proposes a two-stage optimization framework that integrates reinforcement learning (RL) and mathematical programming (MP). The RL layer determines batch allocations and production sequences, which are then transmitted as time windows within which the MP layer optimizes continuous variables to ensure feasibility. To handle hybrid action spaces, a mapping mechanism is introduced to unify discrete and continuous decisions. In addition, dynamic short-term targets based on reformulated constraints are designed to address the sparsity of rewards caused by long-horizon objectives. Experiments on polyolefin production scheduling demonstrate that the proposed method outperforms MP and standalone RL in terms of economic profit, production stability, and computational performance.
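To make the two-stage structure concrete, the sketch below is a minimal, hypothetical illustration, not the authors' implementation: a random sequence sampler stands in for the RL layer's discrete decisions, and a small linear program solved with scipy.optimize.linprog stands in for the MP layer that sizes batch durations within the horizon. All product data, changeover times, and function names are invented for illustration.

```python
# Minimal sketch of a two-stage "discrete decision -> continuous refinement"
# scheduling loop (hypothetical toy data; NOT the authors' implementation).
# A random sequence sampler stands in for the RL layer; a small LP solved
# with scipy.optimize.linprog stands in for the MP layer.
import random
from scipy.optimize import linprog

# Toy problem data (all values invented for illustration)
PRODUCTS = ["A", "B", "C", "D"]
PROFIT_RATE = {"A": 5.0, "B": 3.0, "C": 4.0, "D": 2.0}   # profit per hour
FAMILY = {"A": 1, "B": 1, "C": 2, "D": 2}                # product families
DUR_MIN, DUR_MAX = 4.0, 12.0                             # batch duration bounds [h]
HORIZON = 36.0                                           # scheduling horizon [h]


def changeover_time(sequence):
    """Hours lost between consecutive batches: cheap within a family,
    expensive across families, so the discrete sequence actually matters."""
    return sum(1.0 if FAMILY[a] == FAMILY[b] else 3.0
               for a, b in zip(sequence, sequence[1:]))


def upper_stage_policy(rng):
    """Stand-in for the RL layer: propose a production sequence.
    A trained policy network would replace this random sampler."""
    sequence = PRODUCTS[:]
    rng.shuffle(sequence)
    return sequence


def lower_stage_lp(sequence):
    """Stand-in for the MP layer: with the sequence fixed, choose batch
    durations inside their bounds so everything (including changeovers)
    fits the horizon and profit is maximized -- a small LP."""
    c = [-PROFIT_RATE[p] for p in sequence]          # maximize = minimize negative
    A_ub = [[1.0] * len(sequence)]                   # sum of durations
    b_ub = [HORIZON - changeover_time(sequence)]     # time left after changeovers
    bounds = [(DUR_MIN, DUR_MAX)] * len(sequence)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        return None, float("-inf")                   # sequence is infeasible
    return res.x, -res.fun                           # durations, total profit


def two_stage_search(episodes=50, seed=0):
    """Outer loop: sample sequences, score each with the LP, keep the best.
    In an RL setting this feedback would update the policy instead."""
    rng = random.Random(seed)
    best_seq, best_dur, best_profit = None, None, float("-inf")
    for _ in range(episodes):
        seq = upper_stage_policy(rng)
        durations, profit = lower_stage_lp(seq)
        if profit > best_profit:
            best_seq, best_dur, best_profit = seq, durations, profit
    return best_seq, best_dur, best_profit


if __name__ == "__main__":
    seq, durations, profit = two_stage_search()
    print("sequence:", seq)
    print("durations [h]:", [round(d, 1) for d in durations])
    print("profit:", round(profit, 1))
```

The design point the sketch tries to mirror is the division of labour described in the abstract: fixing the discrete decisions upstream leaves only a small continuous program downstream, which keeps the lower-stage problem tractable and guarantees feasibility of the combined schedule.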
Source journal
Computers & Chemical Engineering (Engineering & Technology, Chemical Engineering)
CiteScore: 8.70
Self-citation rate: 14.00%
Annual publications: 374
Review time: 70 days
Journal description: Computers & Chemical Engineering is primarily a journal of record for new developments in the application of computing and systems technology to chemical engineering problems.