Value Iteration for Stochastic LQR With Convergence Guarantees

Jing Lai; Junlin Xiong; Yu Kang
IEEE Transactions on Neural Networks and Learning Systems, vol. 36, no. 6, pp. 11640-11649
DOI: 10.1109/TNNLS.2025.3558738 | Published: 2025-04-28
https://ieeexplore.ieee.org/document/10977962/

This brief studies the discounted stochastic linear quadratic regulator (LQR) problem for systems subject to additive noise of unknown mean. A completely model-free (MF) value iteration (VI) algorithm is developed to learn the optimal control policy from offline system trajectories. The generated control policies are proven to converge, with high probability, to a small neighborhood of the optimal ones. In addition, an MF algorithm is proposed to learn a feasible discount factor. The proposed MF algorithms are illustrated through several examples.
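The brief's model-free algorithm itself is not reproduced here, but the discounted LQR value iteration it builds on can be sketched in its classical model-based form: repeatedly apply the Riccati-style Bellman update until the cost matrix converges. The system matrices, discount factor, and numerical values below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def discounted_lqr_vi(A, B, Q, R, gamma, iters=500, tol=1e-10):
    """Value iteration for the discounted LQR Bellman equation.

    Iterates P <- Q + gamma*A'PA - gamma^2*A'PB (R + gamma*B'PB)^{-1} B'PA,
    the fixed point of V(x) = min_u x'Qx + u'Ru + gamma*E[V(Ax + Bu + w)].
    Returns the cost matrix P and gain K for the policy u = -K x.
    (Additive noise of fixed mean shifts V by a constant and does not
    affect P or K, which is why this sketch can ignore it.)
    """
    n = A.shape[0]
    P = np.zeros((n, n))
    K = np.zeros((B.shape[1], n))
    for _ in range(iters):
        # Riccati-style VI update with discount factor gamma
        G = R + gamma * B.T @ P @ B
        K = gamma * np.linalg.solve(G, B.T @ P @ A)
        P_next = Q + gamma * A.T @ P @ A - gamma * A.T @ P @ B @ K
        if np.max(np.abs(P_next - P)) < tol:
            P = P_next
            break
        P = P_next
    return P, K

# Illustrative scalar system (values are assumptions for demonstration)
A = np.array([[0.9]])
B = np.array([[1.0]])
Q = np.array([[1.0]])
R = np.array([[1.0]])
gamma = 0.95
P, K = discounted_lqr_vi(A, B, Q, R, gamma)
```

The model-free algorithm in the brief replaces the exact update above with estimates learned from offline trajectories, which is why its policies converge only to a small neighborhood of the optimum rather than to it exactly.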
Journal Introduction:
The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.