A Homotopy Method for Continuous-Time Model-Free LQR Control Based on Policy Iteration

Impact factor 19.2; Zone 1 (Computer Science); JCR Q1 (Automation & Control Systems)
Wenwu Fan;Junlin Xiong
IEEE/CAA Journal of Automatica Sinica, vol. 12, no. 8, pp. 1673-1682. Published 2025-03-07. DOI: 10.1109/JAS.2025.125132. URL: https://ieeexplore.ieee.org/document/10916676/

Abstract

Reinforcement learning control theory has matured considerably in recent years. However, model-free value iteration needs many iterations to reach a desired precision, and model-free policy iteration requires an initial stabilizing control policy. A fast model-free algorithm that solves the continuous-time linear quadratic control problem without an initial stabilizing control policy is therefore of significant interest. In this paper, we construct a homotopy path on which each point corresponds to a linear quadratic regulator problem. Based on policy iteration, model-based and model-free homotopy algorithms are proposed to solve the optimal control problem of continuous-time linear systems along the homotopy path. Our algorithms are accelerated using first-order differential information and do not require an initial stabilizing control policy. Finally, several practical examples are used to illustrate our results.
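The idea of following a path of LQR problems with policy iteration can be illustrated with a small model-based sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the path is the simple shift A(λ) = A − (1 − λ)αI, chosen so that the zero gain is stabilizing at λ = 0; the inner solver is standard Kleinman policy iteration; the step along λ is halved whenever the warm-start gain is not stabilizing; and the function names `lyap`, `policy_iteration`, and `homotopy_lqr` are hypothetical. The first-order differential speed-up and the model-free variant mentioned in the abstract are not implemented.

```python
import numpy as np

def lyap(Ac, M):
    """Solve Ac^T P + P Ac = -M via Kronecker products (adequate for small n)."""
    n = Ac.shape[0]
    I = np.eye(n)
    L = np.kron(Ac.T, I) + np.kron(I, Ac.T)  # row-major vectorization
    P = np.linalg.solve(L, -M.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off

def policy_iteration(A, B, Q, R, K, tol=1e-10, max_iter=100):
    """Kleinman policy iteration; K must stabilize A - B K on entry."""
    for _ in range(max_iter):
        P = lyap(A - B @ K, Q + K.T @ R @ K)   # policy evaluation
        K_new = np.linalg.solve(R, B.T @ P)    # policy improvement
        if np.linalg.norm(K_new - K) < tol:
            return P, K_new
        K = K_new
    return P, K

def homotopy_lqr(A, B, Q, R, steps=10, margin=1.0):
    """Track LQR solutions along the assumed path A(lam) = A - (1 - lam)*alpha*I."""
    n, m = B.shape
    I = np.eye(n)
    # Shift large enough that A - alpha*I is Hurwitz, so K = 0 is stabilizing.
    alpha = max(0.0, np.linalg.eigvals(A).real.max()) + margin
    lam, step = 0.0, 1.0 / steps
    P, K = policy_iteration(A - alpha * I, B, Q, R, np.zeros((m, n)))
    while lam < 1.0:
        nxt = min(1.0, lam + step)
        A_nxt = A - (1.0 - nxt) * alpha * I
        if np.linalg.eigvals(A_nxt - B @ K).real.max() >= 0.0:
            step /= 2.0  # warm start not stabilizing: take a shorter step
            continue
        P, K = policy_iteration(A_nxt, B, Q, R, K)
        lam = nxt
    return P, K

# Open-loop unstable example (eigenvalues 1 and -2); no stabilizing gain supplied.
A = np.array([[0.0, 1.0], [2.0, -1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
P, K = homotopy_lqr(A, B, Q, R)
```

At λ = 1 the shifted system coincides with the original one, so the returned `P` should satisfy the algebraic Riccati equation for (A, B, Q, R), and `K` should be the corresponding stabilizing gain.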
Source journal
IEEE/CAA Journal of Automatica Sinica (Engineering: Control and Systems Engineering)
CiteScore: 23.50
Self-citation rate: 11.00%
Articles published: 880
Journal description: The IEEE/CAA Journal of Automatica Sinica publishes papers in English on original theoretical and experimental research and development in the field of automation. The journal covers a wide range of topics including automatic control, artificial intelligence and intelligent control, systems theory and engineering, pattern recognition and intelligent systems, automation engineering and applications, information processing and information systems, network-based automation, robotics, sensing and measurement, and navigation, guidance, and control. The journal is abstracted/indexed in SCIE (Science Citation Index Expanded), EI (Engineering Index), Inspec, Scopus, SCImago, DBLP, CNKI (China National Knowledge Infrastructure), CSCD (Chinese Science Citation Database), and IEEE Xplore.