Learning Based Optimal Sensor Selection for Linear Quadratic Control with Unknown Sensor Noise Covariance
Authors: Jinna Li, Xinru Wang, Xiangyu Meng
Published in: 2023 American Control Conference (ACC), 2023-05-31
DOI: 10.23919/ACC55779.2023.10156247 (https://doi.org/10.23919/ACC55779.2023.10156247)
Citations: 0
Abstract
In this article, an optimal sensor selection problem is considered within the framework of linear quadratic control. The objective is to find the best strategy for selecting one sensor from a set of sensors at each time step so that the expected cost, which measures system performance, is minimized over multiple time steps. This problem is formulated as a multi-armed bandit problem. Uncertainties are captured through noisy sensor measurements, which account for the performance deterioration caused by the unknown sensor noise covariance. In this context, several action-value-based reinforcement learning methods are proposed to evaluate the performance of different sensor selection strategies. Moreover, a statistical method is developed to estimate the unknown sensor noise covariance as a byproduct. Almost sure convergence to the true sensor noise covariance is guaranteed as the number of times a sensor is selected goes to infinity. A linear quadratic control example is presented to illustrate the proposed approaches and to demonstrate their effectiveness.
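To make the bandit formulation concrete, the following is a minimal sketch of one action-value method (epsilon-greedy with sample-average updates) applied to a toy scalar problem. It is not the authors' algorithm: the abstract does not give the system model, the cost, or the covariance estimator, so everything here is an assumption for illustration. In particular, the function name `select_sensors`, the scalar gains `a`, `b`, `L`, `r`, the one-step cost, and the assumption that the measurement residual is available for the variance estimate are all hypothetical.

```python
import numpy as np


def select_sensors(noise_std, steps=5000, epsilon=0.1, seed=0,
                   a=1.0, b=1.0, L=0.5, r=0.1):
    """Epsilon-greedy sensor selection with sample-average action values.

    noise_std holds the true noise standard deviation of each sensor. It is
    unknown to the learner and is used only to simulate measurements.
    Returns (q, counts, var_hat): estimated expected cost per sensor, number
    of selections per sensor, and estimated noise variance per sensor.
    """
    rng = np.random.default_rng(seed)
    noise_std = np.asarray(noise_std, dtype=float)
    k = noise_std.size
    q = np.zeros(k)                  # action values: estimated expected cost
    counts = np.zeros(k, dtype=int)  # how often each sensor was selected
    sq_sum = np.zeros(k)             # running sum of squared residuals

    for t in range(steps):
        if t < k:
            arm = t                        # select every sensor once first
        elif rng.random() < epsilon:
            arm = int(rng.integers(k))     # explore: uniformly random sensor
        else:
            arm = int(np.argmin(q))        # exploit: lowest estimated cost

        # Toy one-step problem (assumption): scalar state, output feedback.
        x = rng.standard_normal()                     # current state
        v = noise_std[arm] * rng.standard_normal()    # zero-mean sensor noise
        y = x + v                                     # noisy measurement
        u = -L * y                                    # control from measurement
        x_next = a * x + b * u
        cost = x_next ** 2 + r * u ** 2               # quadratic stage cost

        counts[arm] += 1
        q[arm] += (cost - q[arm]) / counts[arm]       # sample-average update

        # Assumption: the residual y - x is observable (e.g. calibration).
        sq_sum[arm] += (y - x) ** 2

    var_hat = sq_sum / counts  # sample second moment of zero-mean noise
    return q, counts, var_hat


if __name__ == "__main__":
    q, counts, var_hat = select_sensors([0.2, 1.0, 2.0])
    print("estimated costs:", q)
    print("selection counts:", counts)
    print("estimated noise variances:", var_hat)
```

In this toy setting the expected cost grows with the noise variance, so the greedy choice tends toward the least noisy sensor, while the exploration rate `epsilon` keeps every sensor selected infinitely often in the limit. That is what lets the per-sensor sample average converge to the true variance by the strong law of large numbers, which mirrors the convergence condition stated in the abstract.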