Inverse reinforcement learning for discrete-time linear systems based on inverse optimal control.
Jiashun Huang, Dengguo Xu, Yahui Li, Xiang Zhang, Jingling Zhao
ISA Transactions, Journal Article, published 2025-05-19. DOI: 10.1016/j.isatra.2025.04.027
This paper deals with inverse reinforcement learning (IRL) for discrete-time linear time-invariant systems. Based on input and state measurement data from an expert agent, several algorithms are proposed to reconstruct the cost function of the underlying optimal control problem. Each algorithm consists of three main steps: updating the control gain via an algebraic Riccati equation (ARE), correcting the cost matrix by gradient descent, and updating the weight matrix based on inverse optimal control (IOC). First, by reformulating the optimal-control gain formula for the learner system, we present a model-based IRL algorithm: when the system model is fully known, the cost function can be computed iteratively. Then, we develop a partially model-free IRL framework that reconstructs the cost function by introducing auxiliary control inputs and decomposing the algorithm into an outer and an inner loop; as a result, the weight matrix in the cost function can be reconstructed even when the input matrix is unknown. Moreover, the convergence of the algorithms and the stability of the corresponding closed-loop system are established. Finally, simulations verify the effectiveness of the proposed IRL algorithms.
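The first two steps of the loop sketched in the abstract (solving a discrete ARE for the learner's gain, then gradient-stepping the state-weight matrix toward expert behavior) can be illustrated roughly as follows. This is an illustrative reconstruction under simplifying assumptions, not the paper's actual algorithm: the system matrices, the finite-difference gradient of the gain-matching loss, and the names `lqr_gain` and `recover_Q` are all hypothetical, and the expert is summarized here by a known gain matrix rather than raw input/state data.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def lqr_gain(A, B, Q, R):
    """Solve the discrete-time ARE and return the optimal gain K (u = -K x)."""
    P = solve_discrete_are(A, B, Q, R)
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

def recover_Q(A, B, K_expert, R, Q0, lr=0.02, iters=150, eps=1e-6):
    """Gradient-descent correction of the state-weight matrix Q so that the
    learner's ARE-based gain matches the expert's gain.

    Uses a crude finite-difference gradient as a stand-in for the paper's
    update rule; all tuning constants are illustrative."""
    Q = Q0.copy()
    n = Q.shape[0]

    def loss(Qm):
        return np.linalg.norm(lqr_gain(A, B, Qm, R) - K_expert) ** 2

    h = 1e-5
    for _ in range(iters):
        grad = np.zeros_like(Q)
        base = loss(Q)
        # Finite-difference gradient over symmetric perturbations of Q.
        for i in range(n):
            for j in range(i, n):
                E = np.zeros_like(Q)
                E[i, j] = E[j, i] = h
                g = (loss(Q + E) - base) / h
                grad[i, j] = grad[j, i] = g
        Q = Q - lr * grad
        # Project back onto the positive-definite cone so the ARE stays solvable.
        w, V = np.linalg.eigh((Q + Q.T) / 2)
        Q = V @ np.diag(np.maximum(w, eps)) @ V.T
    return Q
```

A minimal usage sketch: compute the expert gain from a known "true" cost, start the learner from the identity weight, and check that the recovered Q reproduces the expert's gain more closely than the initial guess did.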