Towards online reachability analysis with temporal-differencing
Authors: Anayo K. Akametalu, C. Tomlin
Published in: Proceedings of the 18th International Conference on Hybrid Systems: Computation and Control, 2015-04-14
DOI: 10.1145/2728606.2728642 (https://doi.org/10.1145/2728606.2728642)
Abstract
Hamilton-Jacobi-Isaacs (HJI) reachability analysis has been used to guarantee constraint satisfaction (safety) in applications including robotics, air traffic control, and control of HVAC systems. However, the current standard for these methods can produce overly conservative controllers that degrade system performance with respect to lower-priority objectives. There has been interest in incorporating online machine learning techniques to reduce this conservatism, but recent efforts have resulted in methods that are computationally inefficient and scale poorly with the dimension of the state space. We explore a novel online reachability update algorithm based on temporal-difference learning that is computationally more efficient than current methods. Our algorithm is demonstrated on a simulation of a quadrotor learning to track a trajectory in a confined space and on a reach-avoid/pursuit-evader game.