{"title":"AutoRL framework for bioprocess control: Optimizing reward function, architecture, and hyperparameters","authors":"D.A. Goulart , R.D. Pereira , F.V. Silva","doi":"10.1016/j.dche.2025.100261","DOIUrl":null,"url":null,"abstract":"<div><div>This study proposes a structured AutoRL framework for the development of deep reinforcement learning (DRL) controllers in chemical process systems. Focusing on the control of a 3<span><math><mo>×</mo></math></span> 3 MIMO yeast fermentation bioreactor, the methodology jointly optimizes three key internal components of the DRL agent: the reward function, the neural network architecture, and the hyperparameters of the algorithm. A parameterizable logistic reward formulation is introduced to encode control objectives, such as steady-state accuracy, minimalization of actuation effort, and control smoothness, into a flexible and tunable structure. A dual loop optimization strategy combines grid search and Bayesian optimization to systematically explore and refine the agent’s design space. The resulting controller achieved average steady-state errors of 0.009 °C for reactor temperature and 0.19 g/L for ethanol concentration, while maintaining smooth and stable behavior under diverse operational scenarios. By formalizing reward design and integrating it with hyperparameter and architecture optimization, this work delivers a AutoRL methodology for DRL-based control, reducing reliance on expert heuristics and enhancing reproducibility in complex bioprocess applications.</div></div>","PeriodicalId":72815,"journal":{"name":"Digital Chemical Engineering","volume":"16 ","pages":"Article 100261"},"PeriodicalIF":4.1000,"publicationDate":"2025-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Digital Chemical Engineering","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2772508125000456","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, CHEMICAL","Score":null,"Total":0}
Citations: 0
Abstract
This study proposes a structured AutoRL framework for the development of deep reinforcement learning (DRL) controllers in chemical process systems. Focusing on the control of a 3×3 MIMO yeast fermentation bioreactor, the methodology jointly optimizes three key internal components of the DRL agent: the reward function, the neural network architecture, and the hyperparameters of the algorithm. A parameterizable logistic reward formulation is introduced to encode control objectives, such as steady-state accuracy, minimization of actuation effort, and control smoothness, into a flexible and tunable structure. A dual-loop optimization strategy combines grid search and Bayesian optimization to systematically explore and refine the agent's design space. The resulting controller achieved average steady-state errors of 0.009 °C for reactor temperature and 0.19 g/L for ethanol concentration, while maintaining smooth and stable behavior under diverse operational scenarios. By formalizing reward design and integrating it with hyperparameter and architecture optimization, this work delivers an AutoRL methodology for DRL-based control, reducing reliance on expert heuristics and enhancing reproducibility in complex bioprocess applications.
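To make the idea of a "parameterizable logistic reward" concrete, the sketch below shows one plausible way such a reward could be structured: each control objective (tracking accuracy, actuation effort, control smoothness) is mapped through a logistic function with its own scale and steepness, and the terms are combined with tunable weights. This is an illustrative assumption, not the paper's exact formulation; the function names, weights, and shape parameters are hypothetical and would themselves be candidates for the grid-search/Bayesian tuning loop described in the abstract.

```python
import numpy as np

def logistic_term(error, scale, steepness):
    """Map an absolute deviation to (0, 1]: close to 1 while |error| stays
    below `scale`, decaying smoothly (logistically) as it grows past it."""
    return 1.0 / (1.0 + np.exp(steepness * (abs(error) - scale)))

def reward(y, y_sp, u, u_prev,
           weights=(1.0, 0.3, 0.3),
           scales=(0.05, 0.5, 0.1),
           steepness=(50.0, 5.0, 20.0)):
    """Hypothetical parameterizable logistic reward for a MIMO controller.

    y, y_sp   : current outputs and their setpoints (e.g., temperature, ethanol)
    u, u_prev : current and previous control actions
    weights, scales, steepness : tunable parameters exposed to the outer
                                 optimization loop (grid search / Bayesian).
    """
    # Steady-state accuracy: reward small tracking errors on every output.
    tracking = np.mean([logistic_term(yi - ri, scales[0], steepness[0])
                        for yi, ri in zip(y, y_sp)])
    # Actuation effort: reward actions that stay close to zero (or a nominal value).
    effort = np.mean([logistic_term(ui, scales[1], steepness[1]) for ui in u])
    # Control smoothness: reward small changes between consecutive actions.
    smoothness = np.mean([logistic_term(ui - upi, scales[2], steepness[2])
                          for ui, upi in zip(u, u_prev)])
    w_track, w_effort, w_smooth = weights
    return w_track * tracking + w_effort * effort + w_smooth * smoothness
```

Because every objective is expressed through the same bounded logistic shape, the trade-off between accuracy, effort, and smoothness reduces to a small set of continuous parameters, which is what makes the reward amenable to systematic optimization alongside the network architecture and algorithm hyperparameters.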