Hierarchical Markov decision process based on DEVS formalism

2017 Winter Simulation Conference (WSC) | Pub Date: 2017-12-03 | DOI: 10.1109/WSC.2017.8247850

Celine Kessler, L. Capocchi, J. Santucci, B. Zeigler


Citations: 1

Abstract

Markov decision processes (MDPs) have proven useful as models of stochastic planning and decision problems. To make MDPs practical to implement, hierarchical methods are often used in MDPs or reinforcement learning to delegate the optimization of the overall problem to simpler hierarchical sub-problems. The goal of the paper is to propose a generic discrete-event-based software framework that allows hierarchical MDPs and reinforcement learning to be used to solve planning or decision problems. The proposed approach has been validated using the “grid world” use case, a typical MDP example.
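For readers unfamiliar with the “grid world” use case mentioned in the abstract, the following is a minimal sketch of a flat (non-hierarchical) grid-world MDP solved by value iteration. It is not the paper's DEVS-based framework and does not reproduce its hierarchical decomposition. The function names (`step`, `value_iteration`), the deterministic moves, the per-step reward of -1, and the discount factor of 0.9 are all assumptions chosen for illustration.

```python
from typing import Dict, Tuple

State = Tuple[int, int]

# Hypothetical action set: each action is a (row, column) displacement.
ACTIONS: Dict[str, Tuple[int, int]] = {
    "up": (-1, 0),
    "down": (1, 0),
    "left": (0, -1),
    "right": (0, 1),
}


def step(state: State, action: str, rows: int, cols: int) -> State:
    """Deterministic move; bumping into a wall leaves the agent in place."""
    dr, dc = ACTIONS[action]
    r, c = state[0] + dr, state[1] + dc
    return (r, c) if 0 <= r < rows and 0 <= c < cols else state


def value_iteration(rows: int, cols: int, goal: State,
                    gamma: float = 0.9, step_reward: float = -1.0,
                    tol: float = 1e-9):
    """Return (values, policy) for a grid whose only terminal state is `goal`."""
    values = {(r, c): 0.0 for r in range(rows) for c in range(cols)}
    while True:
        delta = 0.0
        for s in values:
            if s == goal:
                continue  # terminal state keeps value 0
            # Bellman optimality backup over all actions (assumed rewards).
            best = max(step_reward + gamma * values[step(s, a, rows, cols)]
                       for a in ACTIONS)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = {
        s: max(ACTIONS,
               key=lambda a: step_reward + gamma * values[step(s, a, rows, cols)])
        for s in values if s != goal
    }
    return values, policy


if __name__ == "__main__":
    values, policy = value_iteration(rows=3, cols=3, goal=(2, 2))
    for r in range(3):
        print(" ".join(f"{values[(r, c)]:7.3f}" for c in range(3)))
    print(policy[(2, 1)])
```

A hierarchical method of the kind the abstract describes would split such a grid into sub-regions and solve each as a smaller sub-problem; this sketch deliberately omits that step.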

