基于轨迹预测与跟踪的在线LQR控制的后悔分析:扩展版

Yitian Chen, Timothy L. Molloy, T. Summers, I. Shames
{"title":"基于轨迹预测与跟踪的在线LQR控制的后悔分析:扩展版","authors":"Yitian Chen, Timothy L. Molloy, T. Summers, I. Shames","doi":"10.48550/arXiv.2302.10411","DOIUrl":null,"url":null,"abstract":"In this paper, we propose and analyze a new method for online linear quadratic regulator (LQR) control with a priori unknown time-varying cost matrices. The cost matrices are revealed sequentially with the potential for future values to be previewed over a short window. Our novel method involves using the available cost matrices to predict the optimal trajectory, and a tracking controller to drive the system towards it. We adopted the notion of dynamic regret to measure the performance of this proposed online LQR control method, with our main result being that the (dynamic) regret of our method is upper bounded by a constant. Moreover, the regret upper bound decays exponentially with the preview window length, and is extendable to systems with disturbances. We show in simulations that our proposed method offers improved performance compared to other previously proposed online LQR methods.","PeriodicalId":268449,"journal":{"name":"Conference on Learning for Dynamics & Control","volume":"67 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Regret Analysis of Online LQR Control via Trajectory Prediction and Tracking: Extended Version\",\"authors\":\"Yitian Chen, Timothy L. Molloy, T. Summers, I. Shames\",\"doi\":\"10.48550/arXiv.2302.10411\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we propose and analyze a new method for online linear quadratic regulator (LQR) control with a priori unknown time-varying cost matrices. The cost matrices are revealed sequentially with the potential for future values to be previewed over a short window. Our novel method involves using the available cost matrices to predict the optimal trajectory, and a tracking controller to drive the system towards it. We adopted the notion of dynamic regret to measure the performance of this proposed online LQR control method, with our main result being that the (dynamic) regret of our method is upper bounded by a constant. Moreover, the regret upper bound decays exponentially with the preview window length, and is extendable to systems with disturbances. We show in simulations that our proposed method offers improved performance compared to other previously proposed online LQR methods.\",\"PeriodicalId\":268449,\"journal\":{\"name\":\"Conference on Learning for Dynamics & Control\",\"volume\":\"67 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-02-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Conference on Learning for Dynamics & Control\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.48550/arXiv.2302.10411\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Conference on Learning for Dynamics & Control","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2302.10411","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

摘要

本文提出并分析了一种具有先验未知时变代价矩阵的在线线性二次型调节器(LQR)控制新方法。成本矩阵按顺序显示,并在短窗口内预览未来值的可能性。我们的新方法包括使用可用的成本矩阵来预测最优轨迹,并使用跟踪控制器来驱动系统向其移动。我们采用动态后悔的概念来衡量所提出的在线LQR控制方法的性能,我们的主要结果是我们的方法的(动态)后悔的上限是一个常数。此外,遗憾上界随预览窗口长度呈指数衰减,并可推广到有扰动的系统。我们在模拟中表明,与其他先前提出的在线LQR方法相比,我们提出的方法提供了更好的性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Regret Analysis of Online LQR Control via Trajectory Prediction and Tracking: Extended Version
In this paper, we propose and analyze a new method for online linear quadratic regulator (LQR) control with a priori unknown time-varying cost matrices. The cost matrices are revealed sequentially with the potential for future values to be previewed over a short window. Our novel method involves using the available cost matrices to predict the optimal trajectory, and a tracking controller to drive the system towards it. We adopted the notion of dynamic regret to measure the performance of this proposed online LQR control method, with our main result being that the (dynamic) regret of our method is upper bounded by a constant. Moreover, the regret upper bound decays exponentially with the preview window length, and is extendable to systems with disturbances. We show in simulations that our proposed method offers improved performance compared to other previously proposed online LQR methods.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信