Collaborative twin actors framework using deep deterministic policy gradient for flexible batch processes

IF 6.0 · CAS Region 1 (Computer Science) · JCR Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)
Xindong Wang , Zidong Liu , Junghui Chen
{"title":"基于深度确定性策略梯度的柔性批处理协同双参与者框架","authors":"Xindong Wang ,&nbsp;Zidong Liu ,&nbsp;Junghui Chen","doi":"10.1016/j.neunet.2025.107461","DOIUrl":null,"url":null,"abstract":"<div><div>Due to its inherent efficiency in the process industry for achieving desired products, batch processing is widely acknowledged for its repetitive nature. Batch-to-batch learning control has traditionally been esteemed as a robust strategy for batch process control. However, the presence of flexible operating conditions in practical batch systems often leads to a lack of prior learning information, hindering learning control from optimizing performance. This article presents a novel approach to flexible batch process control using deep reinforcement learning (DRL) with twin actors. Specifically, a collaborative twin-actor-based deep deterministic policy gradient (CTA-DDPG) method is proposed to generate control policies and ensure safe operation across varying trial lengths and initial conditions. This approach involves the sequential construction of two sets of actor–critic networks with a shared critic. The first set explores meta-policy during an offline stage, while the second set enhances control performance using a supplementary agent during an online stage. To ensure robust policy transfer and efficient learning, a policy integration mechanism and a spatial–temporal experience replay strategy are incorporated, facilitating transfer stability and learning efficiency. The performance of CTA-DDPG is evaluated using both numerical examples and nonlinear injection molding process for tracking control. The results demonstrate the effectiveness and superiority of the proposed method in achieving desired control outcomes.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"188 ","pages":"Article 107461"},"PeriodicalIF":6.0000,"publicationDate":"2025-04-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Collaborative twin actors framework using deep deterministic policy gradient for flexible batch processes\",\"authors\":\"Xindong Wang ,&nbsp;Zidong Liu ,&nbsp;Junghui Chen\",\"doi\":\"10.1016/j.neunet.2025.107461\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Due to its inherent efficiency in the process industry for achieving desired products, batch processing is widely acknowledged for its repetitive nature. Batch-to-batch learning control has traditionally been esteemed as a robust strategy for batch process control. However, the presence of flexible operating conditions in practical batch systems often leads to a lack of prior learning information, hindering learning control from optimizing performance. This article presents a novel approach to flexible batch process control using deep reinforcement learning (DRL) with twin actors. Specifically, a collaborative twin-actor-based deep deterministic policy gradient (CTA-DDPG) method is proposed to generate control policies and ensure safe operation across varying trial lengths and initial conditions. This approach involves the sequential construction of two sets of actor–critic networks with a shared critic. The first set explores meta-policy during an offline stage, while the second set enhances control performance using a supplementary agent during an online stage. 
To ensure robust policy transfer and efficient learning, a policy integration mechanism and a spatial–temporal experience replay strategy are incorporated, facilitating transfer stability and learning efficiency. The performance of CTA-DDPG is evaluated using both numerical examples and nonlinear injection molding process for tracking control. The results demonstrate the effectiveness and superiority of the proposed method in achieving desired control outcomes.</div></div>\",\"PeriodicalId\":49763,\"journal\":{\"name\":\"Neural Networks\",\"volume\":\"188 \",\"pages\":\"Article 107461\"},\"PeriodicalIF\":6.0000,\"publicationDate\":\"2025-04-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neural Networks\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0893608025003405\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Networks","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0893608025003405","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Batch processing is widely used in the process industry for its efficiency in achieving desired products, and it is characterized by its repetitive nature. Batch-to-batch learning control has traditionally been regarded as a robust strategy for batch process control. However, flexible operating conditions in practical batch systems often leave learning control without prior learning information, preventing it from optimizing performance. This article presents a novel approach to flexible batch process control using deep reinforcement learning (DRL) with twin actors. Specifically, a collaborative twin-actor-based deep deterministic policy gradient (CTA-DDPG) method is proposed to generate control policies and ensure safe operation across varying trial lengths and initial conditions. The approach sequentially constructs two sets of actor–critic networks with a shared critic: the first set explores a meta-policy during an offline stage, while the second set enhances control performance with a supplementary agent during an online stage. To ensure robust policy transfer and efficient learning, a policy integration mechanism and a spatial–temporal experience replay strategy are incorporated, improving transfer stability and learning efficiency. The performance of CTA-DDPG is evaluated on both numerical examples and a nonlinear injection molding process for tracking control. The results demonstrate the effectiveness and superiority of the proposed method in achieving the desired control outcomes.
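The abstract gives only a high-level description of the architecture, not implementation details. As a rough illustration, the sketch below shows one plausible way to lay out twin deterministic actors with a shared critic in PyTorch. All names here (Actor, Critic, meta_actor, supp_actor, integrated_action) and the blending weight beta are assumptions made for illustration; the paper's actual CTA-DDPG networks, policy integration mechanism, and spatial–temporal replay are not reproduced.

```python
# Hypothetical sketch of a twin-actor DDPG layout with a shared critic,
# loosely following the structure described in the abstract. All names and
# hyperparameters are illustrative assumptions, not the authors' code.
import torch
import torch.nn as nn


class Actor(nn.Module):
    """Deterministic policy network: maps a state to a bounded control action."""
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim), nn.Tanh(),  # actions scaled to [-1, 1]
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)


class Critic(nn.Module):
    """Q-network shared by both actors: scores (state, action) pairs."""
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([state, action], dim=-1))


# Offline stage: the meta-actor learns a base policy from prior batches.
# Online stage: a supplementary actor refines it for the current conditions.
state_dim, action_dim = 4, 1
meta_actor = Actor(state_dim, action_dim)      # trained offline (meta-policy)
supp_actor = Actor(state_dim, action_dim)      # trained online (supplementary agent)
shared_critic = Critic(state_dim, action_dim)  # single critic shared by both actors


def integrated_action(state: torch.Tensor, beta: float = 0.5) -> torch.Tensor:
    """Blend the two policies; the fixed scalar beta is purely an assumption."""
    with torch.no_grad():
        return (1.0 - beta) * meta_actor(state) + beta * supp_actor(state)


action = integrated_action(torch.zeros(1, state_dim))
q_value = shared_critic(torch.zeros(1, state_dim), action)  # common value estimate
```

One reason a shared critic is plausible here: both actors are then evaluated against a single common value estimate, which is one way to keep the offline meta-policy and the online supplementary policy consistent during transfer.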
Source Journal
Neural Networks
Category: Engineering & Technology - Computer Science: Artificial Intelligence
CiteScore: 13.90
Self-citation rate: 7.70%
Annual articles: 425
Review time: 67 days
Journal description: Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.