Towards online reachability analysis with temporal-differencing
Authors: Anayo K. Akametalu, C. Tomlin
Published in: Proceedings of the 18th International Conference on Hybrid Systems: Computation and Control, 2015-04-14
DOI: https://doi.org/10.1145/2728606.2728642
Hamilton-Jacobi-Isaacs (HJI) reachability analysis has been employed to guarantee constraint satisfaction (safety) in applications including robotics, air traffic control, and control of HVAC systems. However, the current standard for these methods can yield overly conservative controllers that degrade system performance with respect to lower-priority objectives. There has been interest in incorporating online machine learning techniques to reduce the conservativeness of this approach. Recent efforts, however, have resulted in methods that are computationally inefficient and scale poorly with the dimension of the state space. We explore a novel online reachability update algorithm based on temporal-difference learning that is computationally more efficient than current methods. Our algorithm is demonstrated on a simulation of a quadrotor learning to track a trajectory in a confined space and on a reach-avoid/pursuit-evader game.