Output-Feedback Synthesis Orbit Geometry: Quotient Manifolds and LQG Direct Policy Optimization
Spencer Kraisler; Mehran Mesbahi
IEEE Control Systems Letters, published 2024-06-14
DOI: 10.1109/LCSYS.2024.3414962
https://ieeexplore.ieee.org/document/10557741/
We consider direct policy optimization for the linear-quadratic Gaussian (LQG) setting. In recent years, it has been recognized that the landscape of dynamic output-feedback controllers relevant to LQG has an intricate geometry, in particular the existence of degenerate stationary points, that hinders gradient methods. To address these challenges, in this letter we adopt a system-theoretic, coordinate-invariant Riemannian metric on the space of dynamic output-feedback controllers and develop a Riemannian gradient descent method for direct LQG policy optimization. We then prove that the orbit space of such controllers, modulo coordinate transformations, admits a Riemannian quotient manifold structure. This geometric structure, which is of independent interest, provides an effective approach to deriving direct policy optimization algorithms for LQG with a local linear convergence rate guarantee. Finally, we show that the proposed approach exhibits significantly faster and more robust numerical performance than ordinary gradient descent.
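The phrase "modulo coordinate transformations" can be illustrated numerically. The sketch below assumes a standard continuous-time LQG formulation, which the abstract does not spell out: a plant `(A, B, C)` with noise covariances `W`, `V`, cost weights `Q`, `R`, and a full-order controller `(AK, BK, CK)`. It checks that the controller and its image under a change of controller-state coordinates, `(T AK T^-1, T BK, CK T^-1)`, incur the same cost, so every point on one orbit is the same policy. All names, matrices, and helper functions here are hypothetical choices for illustration, not the authors' code.

```python
import numpy as np

def lqg_cost(A, B, C, Q, R, W, V, AK, BK, CK):
    """Steady-state LQG cost of controller (AK, BK, CK); inf if unstable."""
    n, q = A.shape[0], AK.shape[0]
    # Closed-loop dynamics of the stacked state (plant state, controller state).
    Acl = np.block([[A, B @ CK], [BK @ C, AK]])
    if np.max(np.linalg.eigvals(Acl).real) >= 0:
        return np.inf
    # Noise covariance and cost weight seen by the closed loop.
    Wcl = np.block([[W, np.zeros((n, q))], [np.zeros((q, n)), BK @ V @ BK.T]])
    Qcl = np.block([[Q, np.zeros((n, q))], [np.zeros((q, n)), CK.T @ R @ CK]])
    # Solve the Lyapunov equation Acl X + X Acl^T + Wcl = 0 by vectorization.
    N = n + q
    I = np.eye(N)
    lyap_op = np.kron(Acl, I) + np.kron(I, Acl)
    X = np.linalg.solve(lyap_op, -Wcl.reshape(-1)).reshape(N, N)
    return float(np.trace(Qcl @ X))

def change_coordinates(T, AK, BK, CK):
    """Apply an invertible change of controller-state coordinates T."""
    Tinv = np.linalg.inv(T)
    return T @ AK @ Tinv, T @ BK, CK @ Tinv

# Hypothetical plant: a double integrator with position measurement.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
Q, R, W, V = np.eye(2), np.eye(1), np.eye(2), np.eye(1)

# Observer-based controller (stabilizing, not necessarily optimal):
# the closed-loop poles are those of A + B K and A - L C, all at -1.
K = np.array([[-1.0, -2.0]])
L = np.array([[2.0], [1.0]])
AK, BK, CK = A + B @ K - L @ C, L, K

# An arbitrary invertible coordinate change (determinant 5.5).
T = np.array([[2.0, 1.0], [0.5, 3.0]])

J_original = lqg_cost(A, B, C, Q, R, W, V, AK, BK, CK)
J_transformed = lqg_cost(A, B, C, Q, R, W, V, *change_coordinates(T, AK, BK, CK))
print(J_original, J_transformed)
```

Because the cost is constant along each such orbit, a search over raw controller parameters spends effort on directions that do not change the policy; working with a coordinate-invariant metric, or on the quotient by this group action as described in the abstract, is one way to factor those directions out.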