Model-free optimal control for discrete-time Markovian jump linear systems: A Q-learning approach

IF 4.2 | CAS Zone 3 (Computer Science) | Q2 AUTOMATION & CONTROL SYSTEMS
Ehsan Badfar, Babak Tavassoli
{"title":"离散马尔可夫跳变线性系统的无模型最优控制:一种q -学习方法","authors":"Ehsan Badfar,&nbsp;Babak Tavassoli","doi":"10.1016/j.jfranklin.2025.107784","DOIUrl":null,"url":null,"abstract":"<div><div>This paper presents a model-free optimal control strategy for discrete-time Markovian Jump Linear Systems (MJLS) using a Q-learning-based reinforcement learning (RL) framework. Conventional model-based control techniques for MJLS rely on full knowledge of system dynamics and the solution of coupled algebraic Riccati equations (CARE), which may not be feasible in many practical scenarios. To overcome this limitation, we propose a novel Q-function formulation that explicitly incorporates the Markovian switching behavior of the system. An off-policy Q-learning algorithm is developed to estimate the kernel matrix of the Q-function directly from raw input-state data, enabling the computation of optimal controller gains without requiring system models. We rigorously prove that the learned controller gains converge to those of the model-based optimal controller, thereby ensuring mean-square stability. Simulation results on a networked control system with Markovian packet losses demonstrate the convergence, stability, and practical effectiveness of the proposed model-free controller.</div></div>","PeriodicalId":17283,"journal":{"name":"Journal of The Franklin Institute-engineering and Applied Mathematics","volume":"362 12","pages":"Article 107784"},"PeriodicalIF":4.2000,"publicationDate":"2025-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Model-free optimal control for discrete-time Markovian jump linear systems: A Q-learning approach\",\"authors\":\"Ehsan Badfar,&nbsp;Babak Tavassoli\",\"doi\":\"10.1016/j.jfranklin.2025.107784\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>This paper presents a model-free optimal control strategy for discrete-time Markovian Jump Linear Systems (MJLS) using a Q-learning-based reinforcement learning (RL) framework. Conventional model-based control techniques for MJLS rely on full knowledge of system dynamics and the solution of coupled algebraic Riccati equations (CARE), which may not be feasible in many practical scenarios. To overcome this limitation, we propose a novel Q-function formulation that explicitly incorporates the Markovian switching behavior of the system. An off-policy Q-learning algorithm is developed to estimate the kernel matrix of the Q-function directly from raw input-state data, enabling the computation of optimal controller gains without requiring system models. We rigorously prove that the learned controller gains converge to those of the model-based optimal controller, thereby ensuring mean-square stability. 
Simulation results on a networked control system with Markovian packet losses demonstrate the convergence, stability, and practical effectiveness of the proposed model-free controller.</div></div>\",\"PeriodicalId\":17283,\"journal\":{\"name\":\"Journal of The Franklin Institute-engineering and Applied Mathematics\",\"volume\":\"362 12\",\"pages\":\"Article 107784\"},\"PeriodicalIF\":4.2000,\"publicationDate\":\"2025-06-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of The Franklin Institute-engineering and Applied Mathematics\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0016003225002777\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"AUTOMATION & CONTROL SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of The Franklin Institute-engineering and Applied Mathematics","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0016003225002777","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
Citations: 0

Abstract

This paper presents a model-free optimal control strategy for discrete-time Markovian Jump Linear Systems (MJLS) using a Q-learning-based reinforcement learning (RL) framework. Conventional model-based control techniques for MJLS rely on full knowledge of system dynamics and the solution of coupled algebraic Riccati equations (CARE), which may not be feasible in many practical scenarios. To overcome this limitation, we propose a novel Q-function formulation that explicitly incorporates the Markovian switching behavior of the system. An off-policy Q-learning algorithm is developed to estimate the kernel matrix of the Q-function directly from raw input-state data, enabling the computation of optimal controller gains without requiring system models. We rigorously prove that the learned controller gains converge to those of the model-based optimal controller, thereby ensuring mean-square stability. Simulation results on a networked control system with Markovian packet losses demonstrate the convergence, stability, and practical effectiveness of the proposed model-free controller.
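For orientation, the objects the abstract names can be written out for an $N$-mode MJLS as follows. This is the standard formulation and a plausible reading of the paper's setup; the exact definitions (e.g., discounting or noise terms) are those given in the paper itself. The dynamics and cost are

\[
x_{k+1} = A_{\theta_k} x_k + B_{\theta_k} u_k, \qquad
\Pr(\theta_{k+1} = j \mid \theta_k = i) = p_{ij},
\]
\[
J = \mathbb{E}\!\left[\sum_{k=0}^{\infty} \bigl( x_k^{\top} Q_{\theta_k} x_k + u_k^{\top} R_{\theta_k} u_k \bigr)\right],
\qquad
\mathcal{E}_i(P) \triangleq \sum_{j=1}^{N} p_{ij} P_j,
\]

where the operator $\mathcal{E}_i(\cdot)$ couples the $N$ Riccati solutions through the mode transition probabilities. The CARE and a mode-dependent quadratic Q-function kernel then take the form

\[
P_i = Q_i + A_i^{\top}\mathcal{E}_i(P)A_i
      - A_i^{\top}\mathcal{E}_i(P)B_i\,
        \bigl(R_i + B_i^{\top}\mathcal{E}_i(P)B_i\bigr)^{-1}
        B_i^{\top}\mathcal{E}_i(P)A_i,
\]
\[
\mathcal{Q}_i(x,u) = \begin{bmatrix} x \\ u \end{bmatrix}^{\top} H_i \begin{bmatrix} x \\ u \end{bmatrix},
\qquad
H_i = \begin{bmatrix}
Q_i + A_i^{\top}\mathcal{E}_i(P)A_i & A_i^{\top}\mathcal{E}_i(P)B_i \\
B_i^{\top}\mathcal{E}_i(P)A_i & R_i + B_i^{\top}\mathcal{E}_i(P)B_i
\end{bmatrix},
\]

so the optimal mode-dependent gain is recovered from the kernel alone as $K_i = -(H_i^{uu})^{-1} H_i^{ux}$. This is what makes a purely data-driven estimate of $H_i$ sufficient for optimal control: no $A_i$, $B_i$, or $p_{ij}$ is needed once the kernel is known.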
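To make the data-driven side concrete, below is a minimal, self-contained Python sketch of kernel-matrix Q-learning for an MJLS. Everything in it is an assumption for illustration: the two-mode system matrices, transition probabilities, probing-noise level, and sweep counts are hypothetical, and the update shown is a simple value-iteration-style variant rather than the paper's specific off-policy algorithm. It does share the key property the abstract describes: the kernels are estimated by least squares from raw input-state data, and the gains are read off the kernels without the model ever entering the learning update.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-mode MJLS; these matrices are illustrative, not the paper's example.
A = [np.array([[0.9, 0.2], [0.0, 0.8]]), np.array([[0.7, 0.1], [0.2, 0.9]])]
B = [np.array([[0.0], [1.0]]), np.array([[0.5], [1.0]])]
Ptr = np.array([[0.9, 0.1], [0.3, 0.7]])      # mode transition probabilities p_ij
Qc = [np.eye(2), 2.0 * np.eye(2)]             # per-mode state cost weights
Rc = [np.eye(1), np.eye(1)]                   # per-mode input cost weights
n, m, N = 2, 1, 2                             # state dim, input dim, number of modes
nz = n + m

iu = np.triu_indices(nz)                      # parameterize symmetric H by its upper triangle
scale = np.where(iu[0] == iu[1], 1.0, 2.0)    # off-diagonal terms appear twice in z'Hz

def basis(z):
    """Quadratic features: basis(z) @ theta equals z' H z for the H encoded by theta."""
    return scale * np.outer(z, z)[iu]

def unpack(theta):
    """Rebuild the symmetric kernel matrix H from its upper-triangular parameters."""
    Hm = np.zeros((nz, nz))
    Hm[iu] = theta
    return Hm + Hm.T - np.diag(np.diag(Hm))

def gains(H):
    """Policy improvement: K_i = -(H_i^uu)^{-1} H_i^ux for each mode i."""
    return [-np.linalg.solve(Hi[n:, n:], Hi[n:, :n]) for Hi in H]

H = [np.eye(nz) for _ in range(N)]            # initial kernel estimates
for sweep in range(30):                       # repeated Q-learning sweeps
    K = gains(H)
    Phi = [[] for _ in range(N)]
    Y = [[] for _ in range(N)]
    x, i = rng.standard_normal(n), 0
    for k in range(3000):                     # raw input-state data; no model in the update
        u = K[i] @ x + 0.3 * rng.standard_normal(m)   # behavior = greedy + probing noise
        j = rng.choice(N, p=Ptr[i])                   # Markovian mode switch
        xn = A[i] @ x + B[i] @ u                      # plant acts as a black box here
        c = x @ Qc[i] @ x + u @ Rc[i] @ u             # observed stage cost
        zn = np.concatenate([xn, K[j] @ xn])          # greedy action at the successor
        Phi[i].append(basis(np.concatenate([x, u])))
        Y[i].append(c + zn @ H[j] @ zn)               # Bellman target with current estimates
        x, i = xn, j
    H = [unpack(np.linalg.lstsq(np.asarray(Phi[s]), np.asarray(Y[s]), rcond=None)[0])
         for s in range(N)]

for s, Ks in enumerate(gains(H)):
    print(f"learned gain K_{s} = {Ks}")
```

Under these assumptions the printed gains should approach the model-based gains obtained from the CARE solution, mirroring the convergence guarantee the paper proves for its off-policy algorithm.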
Source Journal
CiteScore: 7.30
Self-citation rate: 14.60%
Articles published: 586
Review time: 6.9 months
Journal Introduction: The Journal of The Franklin Institute has an established reputation for publishing high-quality papers in the fields of engineering and applied mathematics. Its current focus is on control systems, complex networks and dynamic systems, signal processing and communications, and their applications. All submitted papers are peer-reviewed. The Journal publishes original research papers and substantive review papers. Papers and special focus issues are judged on their potential lasting value, which has been and continues to be the strength of the Journal of The Franklin Institute.