{"title":"Mobile robotics planning using abstract Markov decision processes","authors":"Pierre Laroche, F. Charpillet, R. Schott","doi":"10.1109/TAI.1999.809804","DOIUrl":null,"url":null,"abstract":"Markov decision processes have been successfully used in robotics for indoor robot navigation problems. They allow the computation of optimal sequences of actions in order to achieve a given goal, accounting for actuator uncertainties. However, MDPs are unsatisfactory at avoiding unknown obstacles. On the other hand, reactive navigators are particularly adapted to that, and don't need any prior knowledge about the environment, but they are unable to plan the set of actions that will permit the realization of a given mission. We present a new state aggregation technique for Markov decision processes, such that part of the work usually dedicated to the planner is achieved by a reactive navigator. Thus some characteristics of our environments, such as the width of corridors, have not been considered, which allows to cluster states together, significantly reducing the state space. As a consequence, policies are computed faster and are shown to be at least as efficient as optimal ones.","PeriodicalId":194023,"journal":{"name":"Proceedings 11th International Conference on Tools with Artificial Intelligence","volume":"12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1999-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings 11th International Conference on Tools with Artificial Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TAI.1999.809804","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 13
Abstract
Markov decision processes have been used successfully in robotics for indoor robot navigation problems. They allow the computation of optimal sequences of actions to achieve a given goal while accounting for actuator uncertainties. However, MDPs perform poorly at avoiding unknown obstacles. Reactive navigators, on the other hand, are particularly well suited to obstacle avoidance and require no prior knowledge of the environment, but they are unable to plan the sequence of actions needed to accomplish a given mission. We present a new state aggregation technique for Markov decision processes in which part of the work usually dedicated to the planner is delegated to a reactive navigator. As a result, some characteristics of our environments, such as the width of corridors, need not be modeled, which allows states to be clustered together and significantly reduces the state space. As a consequence, policies are computed faster and are shown to be at least as efficient as optimal ones.
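To illustrate the idea, here is a minimal sketch (not the authors' implementation) of value iteration on an abstract MDP whose states are topological regions (rooms, corridors) rather than grid cells. The region names, transition model, and the 0.9/0.1 actuator-uncertainty split are all assumptions for illustration; in the paper's scheme, a reactive navigator would handle in-region motion and obstacle avoidance, so details like corridor width never enter the model.

```python
# Minimal value iteration over a hypothetical abstract (aggregated) state space.
# States are regions of an indoor environment, not individual grid cells.

STATES = ["room_A", "corridor_1", "corridor_2", "room_B"]
GOAL = "room_B"
ACTIONS = ["forward", "back"]

# P[s][a] -> list of (next_state, probability).
# The 0.9/0.1 split is an assumed actuator-uncertainty model.
P = {
    "room_A":     {"forward": [("corridor_1", 0.9), ("room_A", 0.1)],
                   "back":    [("room_A", 1.0)]},
    "corridor_1": {"forward": [("corridor_2", 0.9), ("corridor_1", 0.1)],
                   "back":    [("room_A", 0.9), ("corridor_1", 0.1)]},
    "corridor_2": {"forward": [("room_B", 0.9), ("corridor_2", 0.1)],
                   "back":    [("corridor_1", 0.9), ("corridor_2", 0.1)]},
    "room_B":     {"forward": [("room_B", 1.0)],
                   "back":    [("room_B", 1.0)]},
}

def reward(s):
    # Unit reward at the goal region, zero elsewhere (an assumption).
    return 1.0 if s == GOAL else 0.0

def value_iteration(gamma=0.95, eps=1e-6):
    V = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            # Bellman backup over the small abstract state space.
            best = max(sum(p * (reward(s2) + gamma * V[s2])
                           for s2, p in P[s][a]) for a in ACTIONS)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            break
    # Greedy policy over abstract states; a reactive navigator would
    # execute each abstract action while avoiding local obstacles.
    policy = {s: max(ACTIONS,
                     key=lambda a: sum(p * (reward(s2) + gamma * V[s2])
                                       for s2, p in P[s][a]))
              for s in STATES}
    return V, policy

if __name__ == "__main__":
    V, pi = value_iteration()
    for s in STATES:
        print(f"{s}: V={V[s]:.3f}, action={pi[s]}")
```

Because the backup runs over four abstract regions instead of every grid cell those regions contain, each sweep is far cheaper, which is the source of the speed-up the abstract claims.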