Reinforcement learning-guided two-stage optimization framework for multi-product batch scheduling

IF 3.9; Zone 2, Engineering & Technology; Q2, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Computers & Chemical Engineering Pub Date : 2025-09-13 DOI:10.1016/j.compchemeng.2025.109399

Jiawen Zhu, Wenli Du, Chen Fan, Muyi Huang, Chuan Wang, Furong Zhang

Volume 204, Article 109399. Full text: https://www.sciencedirect.com/science/article/pii/S0098135425004028

Citations: 0

Abstract

With the increasing demand for high-end and fine manufacturing, multi-product batch scheduling has become essential in process industries. Its inherent complexity stems from hybrid decision variables and tightly coupled constraints. To address these challenges, this study proposes a two-stage optimization framework that integrates reinforcement learning (RL) and mathematical programming (MP). The RL layer determines batch allocations and production sequences, which are then transmitted as time windows within which the MP layer optimizes continuous variables to ensure feasibility. To handle hybrid action spaces, a mapping mechanism is introduced to unify discrete and continuous decisions. In addition, dynamic short-term targets based on reformulated constraints are designed to address the sparsity of rewards caused by long-horizon objectives. Experiments on polyolefin production scheduling demonstrate that the proposed method outperforms MP and standalone RL in terms of economic profit, production stability, and computational performance.
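The two-stage idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the random-key decoding below is a swapped-in stand-in for their mapping mechanism, the forward pass on a single unit is a stand-in for the MP layer, and every name and number (`decode_action`, `schedule_in_windows`, the amounts, rate bounds, and windows) is hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def decode_action(action: List[float], n_batches: int,
                  rate_bounds: Tuple[float, float]) -> Tuple[List[int], List[float]]:
    """Map one continuous action vector to a discrete sequence plus continuous rates.

    Assumption: the first n_batches entries are random keys (sorting them gives
    the batch order); the next n_batches entries lie in [-1, 1] and are rescaled
    to production rates within rate_bounds.
    """
    keys = action[:n_batches]
    raw = action[n_batches:2 * n_batches]
    sequence = sorted(range(n_batches), key=lambda i: keys[i])
    lo, hi = rate_bounds
    rates = [lo + (min(1.0, max(-1.0, r)) + 1.0) / 2.0 * (hi - lo) for r in raw]
    return sequence, rates


def schedule_in_windows(sequence: List[int], durations: List[float],
                        windows: List[Tuple[float, float]]) -> Optional[Dict[int, float]]:
    """Assign earliest start times on one unit, respecting each batch's time window.

    Returns a {batch: start_time} mapping, or None if any batch would finish
    after the end of its window (an infeasible proposal from stage 1).
    """
    t = 0.0
    starts: Dict[int, float] = {}
    for b in sequence:
        release, deadline = windows[b]
        start = max(t, release)      # wait for the unit and for the window to open
        end = start + durations[b]
        if end > deadline:
            return None
        starts[b] = start
        t = end
    return starts


# Hypothetical data: three batches, one production unit.
amounts = [10.0, 20.0, 9.0]
windows = [(0.0, 20.0), (0.0, 10.0), (5.0, 12.0)]
action = [0.9, 0.1, 0.5, 1.0, 1.0, 0.0]  # would come from a trained policy

sequence, rates = decode_action(action, n_batches=3, rate_bounds=(1.0, 5.0))
durations = [a / r for a, r in zip(amounts, rates)]
starts = schedule_in_windows(sequence, durations, windows)
print(sequence, rates, starts)
```

In a fuller version, a `None` result from the second stage could be turned into a penalty in the reward signal, and the forward pass could be replaced by a linear or mixed-integer program over the continuous variables.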




Source journal

Computers & Chemical Engineering (Engineering & Technology: Chemical Engineering)

CiteScore: 8.70

Self-citation rate: 14.00%

Articles published: 374

Review time: 70 days

About the journal: Computers & Chemical Engineering is primarily a journal of record for new developments in the application of computing and systems technology to chemical engineering problems.