{"title":"切换线性系统的无模型强化学习:一种子空间聚类方法","authors":"Hao Li, Hua Chen, Wei Zhang","doi":"10.1109/ALLERTON.2018.8635985","DOIUrl":null,"url":null,"abstract":"In this paper, we study optimal control of switched linear systems using reinforcement learning. Instead of directly applying existing model-free reinforcement learning algorithms, we propose a Q-learning-based algorithm designed specifically for discrete time switched linear systems. Inspired by the analytical results from optimal control literature, the Q function in our algorithm is approximated by a point-wise minimum form of a finite number of quadratic functions. An associated update scheme based on subspace clustering for such an approximation is also developed which preserves the desired structure during the training process. Numerical examples for both low-dimensional and high-dimensional switched linear systems are provided to demonstrate the performance of our algorithm.","PeriodicalId":299280,"journal":{"name":"2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"On Model-free Reinforcement Learning for Switched Linear Systems: A Subspace Clustering Approach\",\"authors\":\"Hao Li, Hua Chen, Wei Zhang\",\"doi\":\"10.1109/ALLERTON.2018.8635985\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we study optimal control of switched linear systems using reinforcement learning. Instead of directly applying existing model-free reinforcement learning algorithms, we propose a Q-learning-based algorithm designed specifically for discrete time switched linear systems. Inspired by the analytical results from optimal control literature, the Q function in our algorithm is approximated by a point-wise minimum form of a finite number of quadratic functions. An associated update scheme based on subspace clustering for such an approximation is also developed which preserves the desired structure during the training process. Numerical examples for both low-dimensional and high-dimensional switched linear systems are provided to demonstrate the performance of our algorithm.\",\"PeriodicalId\":299280,\"journal\":{\"name\":\"2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton)\",\"volume\":\"9 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ALLERTON.2018.8635985\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ALLERTON.2018.8635985","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
On Model-free Reinforcement Learning for Switched Linear Systems: A Subspace Clustering Approach
In this paper, we study optimal control of switched linear systems using reinforcement learning. Instead of directly applying existing model-free reinforcement learning algorithms, we propose a Q-learning-based algorithm designed specifically for discrete time switched linear systems. Inspired by the analytical results from optimal control literature, the Q function in our algorithm is approximated by a point-wise minimum form of a finite number of quadratic functions. An associated update scheme based on subspace clustering for such an approximation is also developed which preserves the desired structure during the training process. Numerical examples for both low-dimensional and high-dimensional switched linear systems are provided to demonstrate the performance of our algorithm.