Francesco Esposito, Christian Pek, Michael C. Welle, D. Kragic
2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN), pp. 131-138. Published 2021-08-08. DOI: 10.1109/RO-MAN50785.2021.9515548 (https://doi.org/10.1109/RO-MAN50785.2021.9515548)
Learning Task Constraints in Visual-Action Planning from Demonstrations
Visual planning approaches have shown great success for decision-making tasks with no explicit model of the state space. Learning a suitable representation and constructing a latent space where planning can be performed allows non-experts to set up and plan motions by just providing images. However, learned latent spaces are usually not semantically interpretable, and thus it is difficult to integrate task constraints. We propose a novel framework to determine whether plans satisfy constraints, given demonstrations of policies that satisfy or violate the constraints. The demonstrations are realizations of Linear Temporal Logic (LTL) formulas and are used to train Long Short-Term Memory (LSTM) networks directly in the latent space representation. We demonstrate that our architecture enables designers to easily specify, compose, and integrate task constraints, and that it achieves high accuracy. Furthermore, this visual planning framework enables human interaction by coping with the environment changes that a human worker may introduce. We show the flexibility of the method on a box-pushing task in a simulated warehouse setting with different task constraints.
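To make the idea of demonstrations as realizations of LTL formulas concrete, the following is a minimal sketch of one way satisfying and violating demonstrations could be labeled before a sequence classifier is trained. It is not the authors' implementation. The grid-cell state, the predicates `in_forbidden_zone` and `at_goal`, and the helper `label_demo` are hypothetical, and the operators use simple finite-trace semantics. In the framework described above the classifier would receive latent encodings of images, not these symbolic states; the symbolic trace here only stands in for whatever ground truth is available when demonstrations are generated.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical symbolic state: (column, row) of the box on a warehouse grid.
State = Tuple[int, int]
Predicate = Callable[[State], bool]


def always(pred: Predicate, trace: Sequence[State]) -> bool:
    """Finite-trace 'globally' (G pred): pred holds in every state."""
    return all(pred(s) for s in trace)


def eventually(pred: Predicate, trace: Sequence[State]) -> bool:
    """Finite-trace 'finally' (F pred): pred holds in at least one state."""
    return any(pred(s) for s in trace)


def until(p: Predicate, q: Predicate, trace: Sequence[State]) -> bool:
    """Finite-trace 'p U q': q holds at some step, p holds at all earlier steps."""
    for s in trace:
        if q(s):
            return True
        if not p(s):
            return False
    return False  # q never held within the trace


# Hypothetical atomic propositions for a box-pushing task.
def in_forbidden_zone(s: State) -> bool:
    return s[0] == 2 and s[1] in (1, 2)


def at_goal(s: State) -> bool:
    return s == (4, 0)


def label_demo(trace: Sequence[State]) -> int:
    """Return 1 if the trace satisfies G(not forbidden) and F(goal), else 0."""
    avoids_zone = always(lambda s: not in_forbidden_zone(s), trace)
    reaches_goal = eventually(at_goal, trace)
    return int(avoids_zone and reaches_goal)


# Two toy demonstrations: one stays clear of the zone, one passes through it.
safe_demo: List[State] = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
unsafe_demo: List[State] = [(0, 0), (1, 1), (2, 1), (3, 0), (4, 0)]

labels = [label_demo(safe_demo), label_demo(unsafe_demo)]
print(labels)
```

Labels of this kind, paired with the latent sequences of the same demonstrations, would form a binary classification dataset. Composing constraints then amounts to combining formulas (or the classifiers trained on them) with logical conjunction, as `label_demo` does for the two sub-formulas.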