{"title":"Data-Driven Optimization-Based Cost and Optimal Control Inference","authors":"Jiacheng Wu;Wenqian Xue;Frank L. Lewis;Bosen Lian","doi":"10.1109/LCSYS.2025.3584907","DOIUrl":null,"url":null,"abstract":"This letter develops a novel optimization-based inverse reinforcement learning (RL) control algorithm that infers the minimal cost from observed demonstrations via optimization-based policy evaluation and update. The core idea is the simultaneous evaluation of the value function matrix and cost weight during policy evaluation under a given control policy, which simplifies the algorithmic structure and reduces the iterations required for convergence. Based on this idea, we first develop a model-based algorithm with detailed implementation steps, and analyze the monotonicity and convergence properties of the cost weight. Then, based on Willems’ lemma, we develop a data-driven algorithm to learn an equivalent weight matrix from persistently excited (PE) data. We also prove the convergence of the data-driven algorithm and show that the converged results learned from PE data are unbiased. Finally, simulations on a power system are carried out to demonstrate the effectiveness of the proposed inverse RL algorithm.","PeriodicalId":37235,"journal":{"name":"IEEE Control Systems Letters","volume":"9 ","pages":"1700-1705"},"PeriodicalIF":2.0000,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Control Systems Letters","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/11062121/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
Abstract
This letter develops a novel optimization-based inverse reinforcement learning (RL) control algorithm that infers the minimal cost from observed demonstrations via optimization-based policy evaluation and policy update. The core idea is to evaluate the value function matrix and the cost weight simultaneously during policy evaluation under a given control policy, which simplifies the algorithmic structure and reduces the number of iterations required for convergence. Based on this idea, we first develop a model-based algorithm with detailed implementation steps and analyze the monotonicity and convergence properties of the cost weight. Then, based on Willems' fundamental lemma, we develop a data-driven algorithm that learns an equivalent weight matrix from persistently exciting (PE) data. We also prove the convergence of the data-driven algorithm and show that the results learned from PE data are unbiased at convergence. Finally, simulations on a power system demonstrate the effectiveness of the proposed inverse RL algorithm.
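
To make the core idea concrete, the sketch below jointly solves for a value matrix P and a cost weight Q from an expert's feedback gain, in the spirit of the simultaneous policy evaluation described above. It is a minimal illustration, not the letter's algorithm: it assumes known discrete-time dynamics (A, B), a known input weight R, and an expert gain K_e already estimated from demonstrations; `infer_cost` and `sym_basis` are illustrative helper names, and the two conditions imposed are the standard discrete-time LQR Lyapunov and stationarity equations.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def sym_basis(n):
    """Basis matrices spanning the n x n symmetric matrices."""
    basis = []
    for i in range(n):
        for j in range(i, n):
            E = np.zeros((n, n))
            E[i, j] = E[j, i] = 1.0
            basis.append(E)
    return basis

def infer_cost(A, B, K_e, R):
    """Jointly solve for symmetric (P, Q) such that the expert gain K_e
    satisfies the LQR policy-evaluation (Lyapunov) and stationarity
    conditions, with Acl = A - B K_e:
        Acl' P Acl - P + Q + K_e' R K_e = 0,
        B' P Acl = R K_e.
    The stacked linear system is solved by least squares; inverse LQR is
    not unique, so this returns one cost from the equivalent family."""
    Acl = A - B @ K_e
    bP = sym_basis(A.shape[0])
    bQ = sym_basis(A.shape[0])
    cols = []
    for E in bP:  # columns multiplying the unknown entries of P
        cols.append(np.concatenate([(Acl.T @ E @ Acl - E).ravel(),
                                    (B.T @ E @ Acl).ravel()]))
    for E in bQ:  # columns multiplying the unknown entries of Q
        cols.append(np.concatenate([E.ravel(),
                                    np.zeros((B.shape[1], A.shape[0])).ravel()]))
    M = np.column_stack(cols)
    rhs = np.concatenate([(-K_e.T @ R @ K_e).ravel(), (R @ K_e).ravel()])
    theta = np.linalg.lstsq(M, rhs, rcond=None)[0]
    k = len(bP)
    P = sum(t * E for t, E in zip(theta[:k], bP))
    Q = sum(t * E for t, E in zip(theta[k:], bQ))
    return P, Q

# Demo: the expert acts optimally for an unknown weight Q_true.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
R = np.eye(1)
Q_true = np.diag([2.0, 1.0])
P_true = solve_discrete_are(A, B, Q_true, R)
K_e = np.linalg.solve(R + B.T @ P_true @ B, B.T @ P_true @ A)

P_hat, Q_hat = infer_cost(A, B, K_e, R)
# The recovered cost makes the expert gain stationary: the greedy gain
# computed from P_hat coincides with K_e (up to numerical error).
K_hat = np.linalg.solve(R + B.T @ P_hat @ B, B.T @ P_hat @ A)
print("gain mismatch:", np.max(np.abs(K_hat - K_e)))
```

Because inverse LQR admits a family of equivalent costs, the least-squares solution generally differs from the weight that generated the demonstrations, yet it reproduces the expert gain exactly; this mirrors the letter's notion of learning an equivalent weight matrix.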
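
The data-driven step builds on Willems' fundamental lemma, which states that, for a controllable system driven by a persistently exciting input, the Hankel matrix of measured data parameterizes all system trajectories. Below is a minimal sketch of the standard rank check used in data-driven design; `block_hankel` is a hypothetical helper, and the system matrices are illustrative, not taken from the letter's power-system example.

```python
import numpy as np

def block_hankel(w, L):
    """Depth-L block Hankel matrix of a signal w with shape (T, q);
    stacks L shifted copies, giving shape (L*q, T-L+1)."""
    T = w.shape[0]
    N = T - L + 1
    return np.vstack([w[i:i + N].T for i in range(L)])

rng = np.random.default_rng(0)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
n, m, T, L = 2, 1, 40, 3

u = rng.standard_normal((T, m))      # random input is PE w.p. 1
x = np.zeros((T + 1, n))
for k in range(T):                   # one measured trajectory
    x[k + 1] = A @ x[k] + B @ u[k]

# Rank condition behind data-driven design: the stacked matrix of
# initial states and depth-L input columns has full row rank n + m*L,
# so every length-L trajectory is a linear combination of its columns.
H = np.vstack([x[:T - L + 1].T, block_hankel(u, L)])
print(np.linalg.matrix_rank(H), "==", n + m * L)
```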