{"title":"离散马尔可夫跳变线性系统的无模型最优控制:一种q -学习方法","authors":"Ehsan Badfar, Babak Tavassoli","doi":"10.1016/j.jfranklin.2025.107784","DOIUrl":null,"url":null,"abstract":"<div><div>This paper presents a model-free optimal control strategy for discrete-time Markovian Jump Linear Systems (MJLS) using a Q-learning-based reinforcement learning (RL) framework. Conventional model-based control techniques for MJLS rely on full knowledge of system dynamics and the solution of coupled algebraic Riccati equations (CARE), which may not be feasible in many practical scenarios. To overcome this limitation, we propose a novel Q-function formulation that explicitly incorporates the Markovian switching behavior of the system. An off-policy Q-learning algorithm is developed to estimate the kernel matrix of the Q-function directly from raw input-state data, enabling the computation of optimal controller gains without requiring system models. We rigorously prove that the learned controller gains converge to those of the model-based optimal controller, thereby ensuring mean-square stability. Simulation results on a networked control system with Markovian packet losses demonstrate the convergence, stability, and practical effectiveness of the proposed model-free controller.</div></div>","PeriodicalId":17283,"journal":{"name":"Journal of The Franklin Institute-engineering and Applied Mathematics","volume":"362 12","pages":"Article 107784"},"PeriodicalIF":4.2000,"publicationDate":"2025-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Model-free optimal control for discrete-time Markovian jump linear systems: A Q-learning approach\",\"authors\":\"Ehsan Badfar, Babak Tavassoli\",\"doi\":\"10.1016/j.jfranklin.2025.107784\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>This paper presents a model-free optimal control strategy for discrete-time Markovian Jump Linear Systems (MJLS) using a Q-learning-based reinforcement learning (RL) framework. Conventional model-based control techniques for MJLS rely on full knowledge of system dynamics and the solution of coupled algebraic Riccati equations (CARE), which may not be feasible in many practical scenarios. To overcome this limitation, we propose a novel Q-function formulation that explicitly incorporates the Markovian switching behavior of the system. An off-policy Q-learning algorithm is developed to estimate the kernel matrix of the Q-function directly from raw input-state data, enabling the computation of optimal controller gains without requiring system models. We rigorously prove that the learned controller gains converge to those of the model-based optimal controller, thereby ensuring mean-square stability. 
Simulation results on a networked control system with Markovian packet losses demonstrate the convergence, stability, and practical effectiveness of the proposed model-free controller.</div></div>\",\"PeriodicalId\":17283,\"journal\":{\"name\":\"Journal of The Franklin Institute-engineering and Applied Mathematics\",\"volume\":\"362 12\",\"pages\":\"Article 107784\"},\"PeriodicalIF\":4.2000,\"publicationDate\":\"2025-06-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of The Franklin Institute-engineering and Applied Mathematics\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0016003225002777\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"AUTOMATION & CONTROL SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of The Franklin Institute-engineering and Applied Mathematics","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0016003225002777","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
Model-free optimal control for discrete-time Markovian jump linear systems: A Q-learning approach
This paper presents a model-free optimal control strategy for discrete-time Markovian Jump Linear Systems (MJLS) using a Q-learning-based reinforcement learning (RL) framework. Conventional model-based control techniques for MJLS rely on full knowledge of system dynamics and the solution of coupled algebraic Riccati equations (CARE), which may not be feasible in many practical scenarios. To overcome this limitation, we propose a novel Q-function formulation that explicitly incorporates the Markovian switching behavior of the system. An off-policy Q-learning algorithm is developed to estimate the kernel matrix of the Q-function directly from raw input-state data, enabling the computation of optimal controller gains without requiring system models. We rigorously prove that the learned controller gains converge to those of the model-based optimal controller, thereby ensuring mean-square stability. Simulation results on a networked control system with Markovian packet losses demonstrate the convergence, stability, and practical effectiveness of the proposed model-free controller.
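The following is a minimal sketch (not the authors' code) of how an off-policy Q-learning scheme of the kind described in the abstract might look for a discrete-time MJLS. It assumes the usual quadratic parameterization Q_i(x, u) = z' H_i z with z = [x; u] for each Markov mode i, fits the mode-dependent kernel matrices H_i by least squares on a batch of exploratory input-state data, and reads the controller gains off the partitioned kernels. The dynamics, cost weights, and mode transition matrix are illustrative placeholders, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- illustrative two-mode MJLS and quadratic stage cost (placeholders) ---
A = [np.array([[0.9, 0.1], [0.0, 0.8]]), np.array([[0.7, 0.2], [0.1, 0.9]])]
B = [np.array([[0.0], [0.1]]), np.array([[0.1], [0.0]])]
Pr = np.array([[0.9, 0.1], [0.3, 0.7]])          # mode transition probabilities
Qc, Rc = np.eye(2), np.eye(1)                    # state and input cost weights
n, m, N = 2, 1, 2                                # state, input, and mode dimensions
nz = n + m                                       # size of the augmented vector z = [x; u]
iu, ju = np.triu_indices(nz)
p = iu.size                                      # free parameters per kernel H_i

def basis(z):
    """Quadratic basis such that z' H z = basis(z) @ theta with theta = triu(H)."""
    outer = np.outer(z, z)[iu, ju]
    return np.where(iu == ju, 1.0, 2.0) * outer

def unpack(theta):
    """Rebuild the symmetric kernel H from its upper-triangular parameters."""
    U = np.zeros((nz, nz))
    U[iu, ju] = theta
    return U + U.T - np.diag(np.diag(U))

# --- collect one batch of exploratory input-state data (behavior policy) ---
T, data = 400, []
x, mode = np.zeros(n), 0
for k in range(T):
    u = 0.5 * rng.standard_normal(m)             # purely exploratory input
    cost = x @ Qc @ x + u @ Rc @ u
    next_mode = rng.choice(N, p=Pr[mode])
    x_next = A[mode] @ x + B[mode] @ u + 0.01 * rng.standard_normal(n)
    data.append((mode, x, u, cost, next_mode, x_next))
    x, mode = x_next, next_mode

# --- off-policy policy iteration on the stored batch ---
K = [np.zeros((m, n)) for _ in range(N)]         # initial gains to be improved
for it in range(20):
    Phi, y = np.zeros((T, N * p)), np.zeros(T)
    for r, (i, xi, ui, ci, j, xj) in enumerate(data):
        zi = np.concatenate([xi, ui])
        zj = np.concatenate([xj, K[j] @ xj])     # greedy action in the successor mode
        Phi[r, i * p:(i + 1) * p] += basis(zi)   # sample-based Bellman identity:
        Phi[r, j * p:(j + 1) * p] -= basis(zj)   # Q_i(x,u) - Q_j(x',K_j x') = cost
        y[r] = ci
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    for i in range(N):
        H = unpack(theta[i * p:(i + 1) * p])
        Huu, Hux = H[n:, n:], H[n:, :n]
        K[i] = -np.linalg.solve(Huu, Hux)        # mode-dependent gain estimate

print("learned mode-dependent gains:")
for i, Ki in enumerate(K):
    print(f"  K_{i} =", np.round(Ki, 3))
```

Note that this sketch evaluates the Bellman equation with the observed successor mode rather than the expectation over the transition probabilities; whether the paper uses this sample-based form or a different regressor construction is an assumption here.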
Journal introduction:
The Journal of The Franklin Institute has an established reputation for publishing high-quality papers in the field of engineering and applied mathematics. Its current focus is on control systems, complex networks and dynamic systems, signal processing and communications, and their applications. All submitted papers are peer-reviewed. The Journal publishes original research papers and substantive review papers. Papers and special focus issues are judged on their potential lasting value, which has been and continues to be the strength of the Journal of The Franklin Institute.