A Homotopy Method for Continuous-Time Model-Free LQR Control Based on Policy Iteration

IF 19.2 | Zone 1 Computer Science | Q1 AUTOMATION & CONTROL SYSTEMS

IEEE/CAA Journal of Automatica Sinica | Pub Date: 2025-03-07 | DOI: 10.1109/JAS.2025.125132

Wenwu Fan; Junlin Xiong

Citations: 0

Abstract

In recent years, reinforcement learning control theory has been well developed. However, model-free value iteration needs many iterations to achieve the desired precision, and model-free policy iteration requires an initial stabilizing control policy. It is therefore important to develop a fast model-free algorithm that solves the continuous-time linear quadratic control problem without an initial stabilizing control policy. In this paper, we construct a homotopy path on which each point corresponds to a linear quadratic regulator problem. Based on policy iteration, model-based and model-free homotopy algorithms are proposed to solve the optimal control problem of continuous-time linear systems along the homotopy path. Our algorithms are accelerated using first-order differential information and do not require an initial stabilizing control policy. Finally, several practical examples are used to illustrate our results.
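The idea of following a path of LQR problems with policy iteration can be illustrated with a minimal model-based sketch for a scalar system. Everything specific here is an assumption made for illustration and is not taken from the paper: the path a(lam) = a - (1 - lam)*alpha, the uniform step schedule, and the function names `policy_iteration`, `homotopy_lqr`, and `riccati_closed_form` are all hypothetical. The sketch also omits the first-order differential acceleration and the model-free variant mentioned in the abstract. It only shows why a zero gain can serve as a starting point when the first problem on the path is open-loop stable, and how each converged gain warm-starts the next problem.

```python
import math


def policy_iteration(a, b, q, r, k, tol=1e-12, max_iter=100):
    """Kleinman-style policy iteration for the scalar LQR problem
    dx/dt = a*x + b*u with cost integral of (q*x^2 + r*u^2) dt.
    The initial gain k must be stabilizing, i.e. a - b*k < 0."""
    if a - b * k >= 0:
        raise ValueError("initial gain is not stabilizing")
    p = None
    for _ in range(max_iter):
        # Policy evaluation: scalar Lyapunov equation
        # 2*(a - b*k)*p + q + r*k^2 = 0
        p = (q + r * k * k) / (-2.0 * (a - b * k))
        # Policy improvement
        k_new = b * p / r
        if abs(k_new - k) < tol:
            return p, k_new
        k = k_new
    return p, k


def homotopy_lqr(a, b, q, r, steps=20):
    """Follow the ASSUMED path a(lam) = a - (1 - lam)*alpha from lam = 0
    (open-loop stable, so k = 0 works) to lam = 1 (the original problem),
    warm-starting policy iteration with the previous gain at every step."""
    alpha = max(a, 0.0) + 1.0  # guarantees a(0) = a - alpha < 0
    k = 0.0
    p = None
    for i in range(steps + 1):
        lam = i / steps
        a_lam = a - (1.0 - lam) * alpha
        # policy_iteration raises if the step was too large for the
        # previous gain to remain stabilizing on the new problem.
        p, k = policy_iteration(a_lam, b, q, r, k)
    return p, k


def riccati_closed_form(a, b, q, r):
    """Stabilizing solution of 2*a*p - (b^2/r)*p^2 + q = 0, for checking."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


if __name__ == "__main__":
    p, k = homotopy_lqr(a=2.0, b=1.0, q=1.0, r=1.0)
    print(p, k, riccati_closed_form(2.0, 1.0, 1.0, 1.0))
```

In this sketch the step size matters: the gain found at one point of the path must still stabilize the next point, so a coarse schedule can make `policy_iteration` raise. Calling `policy_iteration(2.0, 1.0, 1.0, 1.0, 0.0)` directly raises for the same reason, which is the limitation of plain policy iteration that the abstract describes.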

Source journal

IEEE/CAA Journal of Automatica Sinica (Engineering: Control and Systems Engineering)

CiteScore

23.50

Self-citation rate

11.00%

Articles published

880

About the journal: The IEEE/CAA Journal of Automatica Sinica publishes English-language papers on original theoretical and experimental research and development in the field of automation. Its scope includes automatic control, artificial intelligence and intelligent control, systems theory and engineering, pattern recognition and intelligent systems, automation engineering and applications, information processing and information systems, network-based automation, robotics, sensing and measurement, and navigation, guidance, and control. The journal is abstracted and indexed in SCIE (Science Citation Index Expanded), EI (Engineering Index), Inspec, Scopus, SCImago, DBLP, CNKI (China National Knowledge Infrastructure), CSCD (Chinese Science Citation Database), and IEEE Xplore.