Learning Adaptive Optimal Controllers for Linear Time-Delay Systems*

2023 American Control Conference (ACC) | Pub Date: 2023-05-31 | DOI: 10.23919/ACC55779.2023.10156108

Leilei Cui, Bo Pang, Zhong-Ping Jiang




Learning Adaptive Optimal Controllers for Linear Time-Delay Systems *

This paper studies learning-based optimal control for a class of infinite-dimensional linear time-delay systems. The aim is to fill a gap in adaptive dynamic programming (ADP), which has not addressed adaptive optimal control of infinite-dimensional systems. The key strategy is to combine classical model-based linear quadratic (LQ) optimal control of time-delay systems with state-of-the-art reinforcement learning (RL) techniques. Both model-based and data-driven policy iteration (PI) approaches are proposed to solve the corresponding algebraic Riccati equation (ARE) with guaranteed convergence. The proposed PI algorithm can be regarded as a generalization of ADP to infinite-dimensional time-delay systems. Its efficiency is demonstrated on a practical application arising from autonomous driving in mixed traffic environments, where human drivers' reaction delay is taken into account.
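For readers unfamiliar with policy iteration for the ARE, the following is a minimal sketch of the classical model-based iteration (often attributed to Kleinman) for a delay-free, finite-dimensional LQ problem. It is not the paper's algorithm, which targets infinite-dimensional time-delay systems. The double-integrator matrices, the initial gain `K0`, and the function name `policy_iteration_lqr` are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov


def policy_iteration_lqr(A, B, Q, R, K0, iters=20):
    """Model-based policy iteration for the delay-free LQ problem.

    K0 must be stabilizing, i.e. A - B @ K0 must be Hurwitz.
    Hypothetical helper for illustration only.
    """
    K = K0
    for _ in range(iters):
        Ak = A - B @ K
        # Policy evaluation: solve Ak^T P + P Ak = -(Q + K^T R K)
        P = solve_continuous_lyapunov(Ak.T, -(Q + K.T @ R @ K))
        # Policy improvement: K = R^{-1} B^T P
        K = np.linalg.solve(R, B.T @ P)
    return P, K


# Assumed example system: a double integrator with unit weights.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.array([[1.0, 1.0]])  # stabilizing initial gain (assumption)

P, K = policy_iteration_lqr(A, B, Q, R, K0)
# Compare against a direct ARE solve.
print(np.allclose(P, solve_continuous_are(A, B, Q, R)))
```

Each pass evaluates the current gain by solving a Lyapunov equation and then improves it. For this delay-free case the iterates converge to the stabilizing ARE solution when the initial gain is stabilizing.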
