Kaylene C. Stocking, D. McPherson, R. Matthew, C. Tomlin
{"title":"连续状态空间的极大似然约束推理","authors":"Kaylene C. Stocking, D. McPherson, R. Matthew, C. Tomlin","doi":"10.1109/icra46639.2022.9811705","DOIUrl":null,"url":null,"abstract":"When a robot observes another agent unexpectedly modifying their behavior, inferring the most likely cause is a valuable tool for maintaining safety and reacting appropriately. In this work, we present a novel method for inferring constraints that works on continuous, possibly sub-optimal demonstrations. We first learn a representation of the continuous-state maximum entropy trajectory distribution using deep reinforcement learning. We then use Monte Carlo sampling from this distribution to generate expected constraint violation probabilities and perform constraint inference. When the demonstrator's dynamics and objective function are known in advance, this process can be performed offline, allowing for real-time constraint inference at the moment demonstrations are observed. We evaluate our approach on two continuous dynamical systems: a 2-dimensional inverted pendulum model, and a 4-dimensional unicycle model that was successfully used for fast constraint inference on a 1/10 scale car remote-controlled by a human.","PeriodicalId":341244,"journal":{"name":"2022 International Conference on Robotics and Automation (ICRA)","volume":"75 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Maximum Likelihood Constraint Inference on Continuous State Spaces\",\"authors\":\"Kaylene C. Stocking, D. McPherson, R. Matthew, C. Tomlin\",\"doi\":\"10.1109/icra46639.2022.9811705\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"When a robot observes another agent unexpectedly modifying their behavior, inferring the most likely cause is a valuable tool for maintaining safety and reacting appropriately. In this work, we present a novel method for inferring constraints that works on continuous, possibly sub-optimal demonstrations. We first learn a representation of the continuous-state maximum entropy trajectory distribution using deep reinforcement learning. We then use Monte Carlo sampling from this distribution to generate expected constraint violation probabilities and perform constraint inference. When the demonstrator's dynamics and objective function are known in advance, this process can be performed offline, allowing for real-time constraint inference at the moment demonstrations are observed. We evaluate our approach on two continuous dynamical systems: a 2-dimensional inverted pendulum model, and a 4-dimensional unicycle model that was successfully used for fast constraint inference on a 1/10 scale car remote-controlled by a human.\",\"PeriodicalId\":341244,\"journal\":{\"name\":\"2022 International Conference on Robotics and Automation (ICRA)\",\"volume\":\"75 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-05-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 International Conference on Robotics and Automation (ICRA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/icra46639.2022.9811705\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 International Conference on Robotics and Automation (ICRA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/icra46639.2022.9811705","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Maximum Likelihood Constraint Inference on Continuous State Spaces
When a robot observes another agent unexpectedly modifying their behavior, inferring the most likely cause is a valuable tool for maintaining safety and reacting appropriately. In this work, we present a novel method for inferring constraints that works on continuous, possibly sub-optimal demonstrations. We first learn a representation of the continuous-state maximum entropy trajectory distribution using deep reinforcement learning. We then use Monte Carlo sampling from this distribution to generate expected constraint violation probabilities and perform constraint inference. When the demonstrator's dynamics and objective function are known in advance, this process can be performed offline, allowing for real-time constraint inference at the moment demonstrations are observed. We evaluate our approach on two continuous dynamical systems: a 2-dimensional inverted pendulum model, and a 4-dimensional unicycle model that was successfully used for fast constraint inference on a 1/10 scale car remote-controlled by a human.