Almost Surely $\sqrt{T}$ Regret for Adaptive LQR

IF 7 · Zone 1 (Computer Science) · JCR Q1 (Automation & Control Systems)

IEEE Transactions on Automatic Control, vol. 70, no. 8, pp. 5145-5159 · Pub Date: 2025-01-29 · DOI: 10.1109/TAC.2025.3535997

Yiwen Lu; Yilin Mo


Citations: 0

Abstract

Almost Surely $\sqrt{T}$ Regret for Adaptive LQR

The linear-quadratic regulation (LQR) problem with unknown system parameters has been widely studied, but it has remained unclear whether $\tilde{\mathcal{O}}(\sqrt{T})$ regret, which is the best known dependence on time, can be achieved almost surely. In this article, we propose an adaptive LQR controller with an almost-sure $\tilde{\mathcal{O}}(\sqrt{T})$ regret upper bound. The controller features a circuit-breaking mechanism, which falls back to a known stabilizing controller when the input signal or control gain is large, thereby circumventing potential safety breaches and ensuring the convergence of the system parameter estimate. Meanwhile, the circuit breaking is shown to be triggered only finitely often, and hence has a negligible effect on the asymptotic performance of the controller. The proposed controller is also validated via simulation on the Tennessee Eastman process (TEP), a commonly used industrial process example.
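The fallback idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the thresholds `u_max` and `k_max`, the function names, the example system, and the gain `K0` are all hypothetical, and the sketch leaves out parameter estimation, exploration noise, and any rule for how long the fallback stays active. It only shows a certainty-equivalent LQR gain being replaced by a known stabilizing gain when the proposed input or the gain itself is large.

```python
import numpy as np

def lqr_gain(A, B, Q, R, iters=500):
    """Gain K (u = K x) from fixed-point iteration of the discrete Riccati equation."""
    P = Q.copy()
    for _ in range(iters):
        BtP = B.T @ P
        F = np.linalg.solve(R + BtP @ B, BtP @ A)  # F = (R + B'PB)^{-1} B'PA
        P = Q + A.T @ P @ (A - B @ F)              # Riccati update
    return -F

def circuit_breaker_control(K_ce, K0, x, u_max, k_max):
    """Return (u, tripped). Falls back to the known stabilizing gain K0 when the
    certainty-equivalent input or gain is large (thresholds are assumptions)."""
    u_ce = K_ce @ x
    tripped = (
        not np.all(np.isfinite(K_ce))
        or np.linalg.norm(u_ce) > u_max
        or np.linalg.norm(K_ce) > k_max
    )
    return (K0 @ x if tripped else u_ce), tripped

# Hypothetical example system (a discretized double integrator), not from the paper.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R = np.eye(2), np.eye(1)
K0 = np.array([[-1.0, -2.0]])   # assumed known stabilizing gain
x = np.array([1.0, 0.0])

K_ce = lqr_gain(A, B, Q, R)     # gain computed from the (here exact) model
u_ok, tripped_ok = circuit_breaker_control(K_ce, K0, x, u_max=10.0, k_max=10.0)
# A wildly large gain, as a poor parameter estimate might produce, trips the breaker.
u_fb, tripped_fb = circuit_breaker_control(1e3 * K_ce, K0, x, u_max=10.0, k_max=10.0)
```

In an adaptive loop, `K_ce` would be recomputed from the current parameter estimate rather than from the true `A` and `B`; the breaker then bounds the input applied while the estimate is still poor.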


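Source journal

IEEE Transactions on Automatic Control (Engineering & Technology / Engineering: Electrical & Electronic)

CiteScore: 11.30

Self-citation rate: 5.90%

Articles published: 824

Review time: 9 months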

About the journal: In the IEEE Transactions on Automatic Control, the IEEE Control Systems Society publishes high-quality papers on the theory, design, and applications of control engineering. Two types of contributions are regularly considered: 1) Papers: Presentation of significant research, development, or application of control concepts. 2) Technical Notes and Correspondence: Brief technical notes, comments on published areas or established control topics, corrections to papers and notes published in the Transactions. In addition, special papers (tutorials, surveys, and perspectives on the theory and applications of control systems topics) are solicited.