{"title":"离散马尔可夫跳变线性系统的无模型最优控制:一种q -学习方法","authors":"Ehsan Badfar, Babak Tavassoli","doi":"10.1016/j.jfranklin.2025.107784","DOIUrl":null,"url":null,"abstract":"<div><div>This paper presents a model-free optimal control strategy for discrete-time Markovian Jump Linear Systems (MJLS) using a Q-learning-based reinforcement learning (RL) framework. Conventional model-based control techniques for MJLS rely on full knowledge of system dynamics and the solution of coupled algebraic Riccati equations (CARE), which may not be feasible in many practical scenarios. To overcome this limitation, we propose a novel Q-function formulation that explicitly incorporates the Markovian switching behavior of the system. An off-policy Q-learning algorithm is developed to estimate the kernel matrix of the Q-function directly from raw input-state data, enabling the computation of optimal controller gains without requiring system models. We rigorously prove that the learned controller gains converge to those of the model-based optimal controller, thereby ensuring mean-square stability. Simulation results on a networked control system with Markovian packet losses demonstrate the convergence, stability, and practical effectiveness of the proposed model-free controller.</div></div>","PeriodicalId":17283,"journal":{"name":"Journal of The Franklin Institute-engineering and Applied Mathematics","volume":"362 12","pages":"Article 107784"},"PeriodicalIF":4.2000,"publicationDate":"2025-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Model-free optimal control for discrete-time Markovian jump linear systems: A Q-learning approach\",\"authors\":\"Ehsan Badfar, Babak Tavassoli\",\"doi\":\"10.1016/j.jfranklin.2025.107784\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>This paper presents a model-free optimal control strategy for discrete-time Markovian Jump Linear Systems (MJLS) using a Q-learning-based reinforcement learning (RL) framework. Conventional model-based control techniques for MJLS rely on full knowledge of system dynamics and the solution of coupled algebraic Riccati equations (CARE), which may not be feasible in many practical scenarios. To overcome this limitation, we propose a novel Q-function formulation that explicitly incorporates the Markovian switching behavior of the system. An off-policy Q-learning algorithm is developed to estimate the kernel matrix of the Q-function directly from raw input-state data, enabling the computation of optimal controller gains without requiring system models. We rigorously prove that the learned controller gains converge to those of the model-based optimal controller, thereby ensuring mean-square stability. 
Simulation results on a networked control system with Markovian packet losses demonstrate the convergence, stability, and practical effectiveness of the proposed model-free controller.</div></div>\",\"PeriodicalId\":17283,\"journal\":{\"name\":\"Journal of The Franklin Institute-engineering and Applied Mathematics\",\"volume\":\"362 12\",\"pages\":\"Article 107784\"},\"PeriodicalIF\":4.2000,\"publicationDate\":\"2025-06-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of The Franklin Institute-engineering and Applied Mathematics\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0016003225002777\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"AUTOMATION & CONTROL SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of The Franklin Institute-engineering and Applied Mathematics","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0016003225002777","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
Model-free optimal control for discrete-time Markovian jump linear systems: A Q-learning approach
This paper presents a model-free optimal control strategy for discrete-time Markovian Jump Linear Systems (MJLS) using a Q-learning-based reinforcement learning (RL) framework. Conventional model-based control techniques for MJLS rely on full knowledge of system dynamics and the solution of coupled algebraic Riccati equations (CARE), which may not be feasible in many practical scenarios. To overcome this limitation, we propose a novel Q-function formulation that explicitly incorporates the Markovian switching behavior of the system. An off-policy Q-learning algorithm is developed to estimate the kernel matrix of the Q-function directly from raw input-state data, enabling the computation of optimal controller gains without requiring system models. We rigorously prove that the learned controller gains converge to those of the model-based optimal controller, thereby ensuring mean-square stability. Simulation results on a networked control system with Markovian packet losses demonstrate the convergence, stability, and practical effectiveness of the proposed model-free controller.
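The following is a minimal sketch (not the authors' code) of how an off-policy Q-learning scheme of the kind described in the abstract might look for a discrete-time MJLS. It assumes the usual quadratic parameterization Q_i(x, u) = z' H_i z with z = [x; u] for each Markov mode i, fits the mode-dependent kernel matrices H_i by least squares on a batch of exploratory input-state data, and reads the controller gains off the partitioned kernels. The dynamics, cost weights, and mode transition matrix are illustrative placeholders, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- illustrative two-mode MJLS and quadratic stage cost (placeholders) ---
A = [np.array([[0.9, 0.1], [0.0, 0.8]]), np.array([[0.7, 0.2], [0.1, 0.9]])]
B = [np.array([[0.0], [0.1]]), np.array([[0.1], [0.0]])]
Pr = np.array([[0.9, 0.1], [0.3, 0.7]])          # mode transition probabilities
Qc, Rc = np.eye(2), np.eye(1)                    # state and input cost weights
n, m, N = 2, 1, 2                                # state, input, and mode dimensions
nz = n + m                                       # size of the augmented vector z = [x; u]
iu, ju = np.triu_indices(nz)
p = iu.size                                      # free parameters per kernel H_i

def basis(z):
    """Quadratic basis such that z' H z = basis(z) @ theta with theta = triu(H)."""
    outer = np.outer(z, z)[iu, ju]
    return np.where(iu == ju, 1.0, 2.0) * outer

def unpack(theta):
    """Rebuild the symmetric kernel H from its upper-triangular parameters."""
    U = np.zeros((nz, nz))
    U[iu, ju] = theta
    return U + U.T - np.diag(np.diag(U))

# --- collect one batch of exploratory input-state data (behavior policy) ---
T, data = 400, []
x, mode = np.zeros(n), 0
for k in range(T):
    u = 0.5 * rng.standard_normal(m)             # purely exploratory input
    cost = x @ Qc @ x + u @ Rc @ u
    next_mode = rng.choice(N, p=Pr[mode])
    x_next = A[mode] @ x + B[mode] @ u + 0.01 * rng.standard_normal(n)
    data.append((mode, x, u, cost, next_mode, x_next))
    x, mode = x_next, next_mode

# --- off-policy policy iteration on the stored batch ---
K = [np.zeros((m, n)) for _ in range(N)]         # initial gains to be improved
for it in range(20):
    Phi, y = np.zeros((T, N * p)), np.zeros(T)
    for r, (i, xi, ui, ci, j, xj) in enumerate(data):
        zi = np.concatenate([xi, ui])
        zj = np.concatenate([xj, K[j] @ xj])     # greedy action in the successor mode
        Phi[r, i * p:(i + 1) * p] += basis(zi)   # sample-based Bellman identity:
        Phi[r, j * p:(j + 1) * p] -= basis(zj)   # Q_i(x,u) - Q_j(x',K_j x') = cost
        y[r] = ci
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    for i in range(N):
        H = unpack(theta[i * p:(i + 1) * p])
        Huu, Hux = H[n:, n:], H[n:, :n]
        K[i] = -np.linalg.solve(Huu, Hux)        # mode-dependent gain estimate

print("learned mode-dependent gains:")
for i, Ki in enumerate(K):
    print(f"  K_{i} =", np.round(Ki, 3))
```

Note that this sketch evaluates the Bellman equation with the observed successor mode rather than the expectation over the transition probabilities; whether the paper uses this sample-based form or a different regressor construction is an assumption here.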
Journal introduction:
The Journal of The Franklin Institute has an established reputation for publishing high-quality papers in the field of engineering and applied mathematics. Its current focus is on control systems, complex networks and dynamic systems, signal processing and communications, and their applications. All submitted papers are peer-reviewed. The Journal publishes original research papers and substantive review papers. Papers and special focus issues are judged on their potential lasting value, which has been and continues to be the strength of the Journal of The Franklin Institute.