Celine Kessler, L. Capocchi, J. Santucci, B. Zeigler
Hierarchical Markov decision process based on DEVS formalism
2017 Winter Simulation Conference (WSC), published 2017-12-03
DOI: 10.1109/WSC.2017.8247850 (https://doi.org/10.1109/WSC.2017.8247850)
Citations: 1
Abstract
Markov decision processes (MDPs) have proven useful as models of stochastic planning and decision problems. To make MDPs practical to implement, hierarchical methods are often used in MDPs or reinforcement learning to delegate the optimization of the overall problem to simpler hierarchical sub-problems. The goal of this paper is to propose a generic discrete-event based software framework that allows hierarchical MDPs and reinforcement learning to be used to solve planning or decision problems. The proposed approach has been validated on the “grid world” use case, a typical MDP example.
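For readers unfamiliar with the grid world use case, the sketch below solves a small flat (non-hierarchical) instance with value iteration. It is not the DEVS-based framework proposed in the paper and does not reproduce its hierarchy. The 3x4 layout, the goal, pit, and wall cells, the rewards, the discount factor, and all function names are illustrative assumptions.

```python
from typing import Dict, List, Tuple

State = Tuple[int, int]

# Hypothetical 3x4 grid; every constant below is an assumption for illustration.
ROWS, COLS = 3, 4
GOAL: State = (0, 3)   # terminal cell, reward +1 on entry
PIT: State = (1, 3)    # terminal cell, reward -1 on entry
WALL: State = (1, 1)   # blocked cell, not a state
STEP_REWARD = -0.04    # small cost for every other move
GAMMA = 0.9            # discount factor
ACTIONS: Dict[str, Tuple[int, int]] = {
    "up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1),
}


def states() -> List[State]:
    """All cells of the grid except the wall."""
    return [(r, c) for r in range(ROWS) for c in range(COLS) if (r, c) != WALL]


def is_terminal(s: State) -> bool:
    return s in (GOAL, PIT)


def step(s: State, a: str) -> Tuple[State, float]:
    """Deterministic transition: bumping into the border or the wall keeps the agent in place."""
    dr, dc = ACTIONS[a]
    nxt = (s[0] + dr, s[1] + dc)
    if not (0 <= nxt[0] < ROWS and 0 <= nxt[1] < COLS) or nxt == WALL:
        nxt = s
    if nxt == GOAL:
        return nxt, 1.0
    if nxt == PIT:
        return nxt, -1.0
    return nxt, STEP_REWARD


def value_iteration(tol: float = 1e-9) -> Dict[State, float]:
    """Repeat the Bellman optimality backup until the largest change falls below tol."""
    V = {s: 0.0 for s in states()}
    while True:
        delta = 0.0
        for s in states():
            if is_terminal(s):
                continue  # terminal states keep value 0
            best = max(r + GAMMA * V[n] for n, r in (step(s, a) for a in ACTIONS))
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V


def greedy_policy(V: Dict[State, float]) -> Dict[State, str]:
    """Pick, in each non-terminal state, an action that maximizes the one-step lookahead."""
    def q(s: State, a: str) -> float:
        nxt, r = step(s, a)
        return r + GAMMA * V[nxt]
    return {s: max(ACTIONS, key=lambda a: q(s, a)) for s in states() if not is_terminal(s)}


if __name__ == "__main__":
    V = value_iteration()
    for s, a in sorted(greedy_policy(V).items()):
        print(s, a, round(V[s], 3))
```

A hierarchical method would split such a grid into sub-problems (for example, rooms or regions), solve each one separately, and let a higher-level decision process choose among them. The flat version above only shows the base problem that such a decomposition starts from.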