Almost Surely $\sqrt{T}$ Regret for Adaptive LQR
Authors: Yiwen Lu; Yilin Mo
Journal: IEEE Transactions on Automatic Control, vol. 70, no. 8, pp. 5145-5159 (Journal Article)
Publication date: 2025-01-29
DOI: 10.1109/TAC.2025.3535997
URL: https://ieeexplore.ieee.org/document/10857470/
The linear-quadratic regulation (LQR) problem with unknown system parameters has been widely studied, but it has remained unclear whether $\tilde{\mathcal{O}}(\sqrt{T})$ regret, which is the best known dependence on time, can be achieved almost surely. In this article, we propose an adaptive LQR controller with an almost-sure $\tilde{\mathcal{O}}(\sqrt{T})$ regret upper bound. The controller features a circuit-breaking mechanism, which falls back to a known stabilizing controller when the input signal or control gain is large, thereby circumventing potential safety breaches and ensuring the convergence of the system parameter estimate. Meanwhile, the circuit-breaking is shown to be triggered only finitely often, and hence has a negligible effect on the asymptotic performance of the controller. The proposed controller is also validated via simulation on the Tennessee Eastman process (TEP), a commonly used industrial process example.
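The circuit-breaking idea in the abstract can be illustrated with a minimal sketch for a scalar system $x_{t+1} = a x_t + b u_t + w_t$. This is not the authors' algorithm: the function names, the thresholds `gain_limit` and `input_limit`, and the fallback gain `k_safe` are hypothetical, and the paper's actual trigger conditions and estimator may differ. The sketch only shows the pattern of computing a certainty-equivalent gain from the current estimate and falling back to a known stabilizing gain when the gain or the resulting input is large.

```python
def scalar_lqr_gain(a, b, q=1.0, r=1.0, iters=200):
    """Certainty-equivalent LQR gain k (u = -k * x) for x' = a*x + b*u.

    Iterates the scalar discrete-time Riccati recursion, written in the
    equivalent form p = q + a^2 * p * r / (r + b^2 * p).
    """
    p = q
    for _ in range(iters):
        p = q + a * a * p * r / (r + b * b * p)
    return a * b * p / (r + b * b * p)


def circuit_breaker(x, k_hat, k_safe, gain_limit, input_limit):
    """Return (u, tripped).

    Uses the estimated gain k_hat unless the gain itself or the input it
    produces exceeds its (hypothetical) threshold; in that case the known
    stabilizing gain k_safe is applied instead.
    """
    u = -k_hat * x
    if abs(k_hat) > gain_limit or abs(u) > input_limit:
        return -k_safe * x, True
    return u, False


# Assumed setup: true system a = 1.2, b = 1.0; k_safe = 0.5 gives the
# stable closed loop a - b * k_safe = 0.7.
k_good = scalar_lqr_gain(1.2, 1.0)   # gain from an accurate estimate
k_bad = scalar_lqr_gain(1.2, 0.01)   # gain from a poor estimate of b

print(circuit_breaker(1.0, k_good, 0.5, gain_limit=10.0, input_limit=5.0))
print(circuit_breaker(1.0, k_bad, 0.5, gain_limit=10.0, input_limit=5.0))
print(circuit_breaker(10.0, 0.8, 0.5, gain_limit=10.0, input_limit=5.0))  # (-5.0, True)
```

In this sketch a badly underestimated `b` inflates the certainty-equivalent gain, so the breaker trips on the gain condition; a large state trips it on the input condition even when the gain is reasonable. Both cases hand control to `k_safe` for that step.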
About the journal:
In the IEEE Transactions on Automatic Control, the IEEE Control Systems Society publishes high-quality papers on the theory, design, and applications of control engineering. Two types of contributions are regularly considered:
1) Papers: Presentation of significant research, development, or application of control concepts.
2) Technical Notes and Correspondence: Brief technical notes, comments on published areas or established control topics, corrections to papers and notes published in the Transactions.
In addition, special papers (tutorials, surveys, and perspectives on the theory and applications of control systems topics) are solicited.