Reinforcement Twinning: From digital twins to model-based reinforcement learning

IF 3.7; Zone 3, Computer Science; Q2, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Journal of Computational Science Pub Date : 2024-08-14 DOI:10.1016/j.jocs.2024.102421

Lorenzo Schena, Pedro A. Marques, Romain Poletti, Samuel Ahizi, Jan Van den Berghe, Miguel A. Mendez

Volume 82, Article 102421; https://www.sciencedirect.com/science/article/pii/S187775032400214X

Citations: 0

Abstract


Reinforcement Twinning: From digital twins to model-based reinforcement learning

The concept of digital twins promises to revolutionize engineering by offering new avenues for optimization, control, and predictive maintenance. We propose a novel framework for simultaneously training the digital twin of an engineering system and an associated control agent. The training of the twin combines methods from adjoint-based data assimilation and system identification, while the training of the control agent combines model-based optimal control and model-free reinforcement learning. The training of the control agent is achieved by letting it evolve independently along two paths: one driven by a model-based optimal control and another driven by reinforcement learning. The virtual environment offered by the digital twin is used as a playground for confrontation and indirect interaction. This interaction occurs as an “expert demonstrator”, where the best policy is selected for the interaction with the real environment and “cloned” to the other if the independent training stagnates. We refer to this framework as Reinforcement Twinning (RT). The framework is tested on three vastly different engineering systems and control tasks, namely (1) the control of a wind turbine subject to time-varying wind speed, (2) the trajectory control of flapping-wing micro air vehicles (FWMAVs) subject to wind gusts, and (3) the mitigation of thermal loads in the management of cryogenic storage tanks. The test cases are implemented using simplified models for which the ground truth on the closure law is available. The results show that the adjoint-based training of the digital twin is remarkably sample-efficient and completed within a few iterations. Concerning the control agent training, the results show that the model-based and the model-free control training benefit from the learning experience and the complementary learning approach of each other. The encouraging results open the path towards implementing the RT framework on real systems.
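The selection and cloning step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Candidate`, `stagnated`, `select_and_clone` and `toy_twin_cost`, the stagnation rule (cost improvement below a tolerance over a fixed window) and the quadratic toy cost are all assumptions made for illustration. The sketch only shows the idea that both policies are scored on the digital twin, the better one is chosen to interact with the real environment, and its parameters are copied to the other policy when that policy's training has stalled.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    """One control policy (hypothetical container, not from the paper)."""
    name: str
    params: List[float]
    history: List[float]  # cost measured on the twin after each training round


def stagnated(history: List[float], window: int = 3, tol: float = 1e-3) -> bool:
    """Assumed rule: cost improved by less than `tol` over the last `window` rounds."""
    if len(history) <= window:
        return False
    return history[-window - 1] - history[-1] < tol


def select_and_clone(model_based: Candidate,
                     model_free: Candidate,
                     evaluate_on_twin: Callable[[List[float]], float]) -> Candidate:
    """Score both policies on the twin, return the better one, and clone it
    into the other policy if the other policy's training has stagnated."""
    for cand in (model_based, model_free):
        cand.history.append(evaluate_on_twin(cand.params))
    # Lower cost is better; on a tie the model-based policy is kept first.
    best, other = sorted((model_based, model_free), key=lambda c: c.history[-1])
    if stagnated(other.history):
        other.params = list(best.params)  # copy, so later updates stay independent
    return best


def toy_twin_cost(params: List[float]) -> float:
    """Stand-in for a rollout on the digital twin: quadratic cost, minimum at 2.0."""
    return (params[0] - 2.0) ** 2


mb = Candidate("model_based", [2.0], [])
mf = Candidate("model_free", [0.0], [4.0, 4.0, 4.0])  # flat cost: stalled training
best = select_and_clone(mb, mf, toy_twin_cost)
print(best.name, mf.params)
```

In this toy run the model-free policy has a flat cost history, so it receives a copy of the model-based parameters, while the model-based policy is the one selected for the real environment. In the framework described above, each policy would then continue training along its own path from that point.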


Source journal

Journal of Computational Science (COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; COMPUTER SCIENCE, THEORY & METHODS)

CiteScore: 5.50
Self-citation rate: 3.00%
Articles published: 227
Review time: 41 days

About the journal: Computational Science is a rapidly growing multi- and interdisciplinary field that uses advanced computing and data analysis to understand and solve complex problems. It has reached a level of predictive capability that now firmly complements the traditional pillars of experimentation and theory. Recent advances in experimental techniques, such as detectors, on-line sensor networks and high-resolution imaging techniques, have opened up new windows into physical and biological processes at many levels of detail. The resulting data explosion allows for detailed data-driven modeling and simulation. This new discipline in science combines computational thinking, modern computational methods, devices and collateral technologies to address problems far beyond the scope of traditional numerical methods. Computational science typically unifies three distinct elements:

• Modeling, Algorithms and Simulations (e.g., numerical and non-numerical, discrete and continuous);
• Software developed to solve science (e.g., biological, physical, and social), engineering, medicine, and humanities problems;
• Computer and information science that develops and optimizes the advanced system hardware, software, networking, and data management components (e.g., problem solving environments).