Efficient incremental planning and learning with multi-valued decision diagrams

JCR: Q1, Mathematics
Jean-Christophe Magnan, Pierre-Henri Wuillemin
Journal of Applied Logic, volume 22, pages 63-90. Published 2017-07-01.
DOI: 10.1016/j.jal.2016.11.032
Source: https://www.sciencedirect.com/science/article/pii/S157086831630091X
Citations: 1

Abstract


In decision-theoretic planning, the factored framework (Factored Markov Decision Process, FMDP) has produced optimized algorithms that use structured representations such as decision trees (Structured Value Iteration (SVI), Structured Policy Iteration (SPI)) or algebraic decision diagrams (Stochastic Planning Using Decision Diagrams (SPUDD)). Because the factored models used by these algorithms can be difficult to design, the SDYNA architecture, which combines learning and planning algorithms over structured representations, was introduced. However, state-of-the-art algorithms for incremental learning, structured decision-theoretic planning, and reinforcement learning either require the problem to be specified with binary variables only, or use data structures whose compactness can be improved, or both. In this paper, we propose Multi-Valued Decision Diagrams (MDDs) as a more efficient data structure for the SDYNA architecture, and we describe a planning algorithm and an incremental learning algorithm dedicated to this representation. For both algorithms, we show experimentally that they yield significant improvements in running time and in the compactness of the computed policy and of the learned model. We then analyze the combination of the two algorithms in an efficient SDYNA instance for simultaneous learning and planning using MDDs.
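
To make the compactness argument concrete, the following is a minimal sketch of a reduced multi-valued decision diagram. It is not the authors' planning or learning algorithm. The class `MDD`, its method names, and the toy reward function are all assumptions introduced for illustration. The sketch shows only two generic reduction rules: identical sub-diagrams are shared through a unique table, and a test whose branches all lead to the same node is skipped.

```python
LEAF = None  # tag marking terminal nodes (variable names are strings)


class MDD:
    """Toy reduced multi-valued decision diagram (illustrative only)."""

    def __init__(self, domains):
        # domains: ordered mapping, variable name -> number of values
        self.domains = dict(domains)
        self.order = list(self.domains)
        self.unique = {}  # node key -> node id (guarantees sharing)
        self.nodes = []   # node id -> node key

    def _intern(self, key):
        # Return the id of an existing identical node, or create one.
        if key not in self.unique:
            self.unique[key] = len(self.nodes)
            self.nodes.append(key)
        return self.unique[key]

    def leaf(self, value):
        return self._intern((LEAF, value))

    def node(self, var, children):
        children = tuple(children)
        assert len(children) == self.domains[var]
        # Redundancy rule: a test whose branches all agree is skipped.
        if all(c == children[0] for c in children):
            return children[0]
        return self._intern((var, children))

    def build(self, f, assignment=None, depth=0):
        # Enumerates every assignment, so exponential: for demos only.
        assignment = assignment or {}
        if depth == len(self.order):
            return self.leaf(f(assignment))
        var = self.order[depth]
        children = [
            self.build(f, {**assignment, var: v}, depth + 1)
            for v in range(self.domains[var])
        ]
        return self.node(var, children)

    def evaluate(self, root, assignment):
        key = self.nodes[root]
        while key[0] is not LEAF:
            var, children = key
            key = self.nodes[children[assignment[var]]]
        return key[1]

    def size(self, root):
        # Number of distinct nodes reachable from root.
        seen, stack = set(), [root]
        while stack:
            n = stack.pop()
            if n in seen:
                continue
            seen.add(n)
            key = self.nodes[n]
            if key[0] is not LEAF:
                stack.extend(key[1])
        return len(seen)


# Hypothetical reward over a 3-valued x and a 4-valued y; y is irrelevant.
mdd = MDD({"x": 3, "y": 4})
root = mdd.build(lambda a: 1.0 if a["x"] == 2 else 0.0)
print(mdd.size(root))  # 3 nodes, versus 16 in the unreduced decision tree
```

Here the three-valued variable `x` is tested by a single node with three branches, and `y` disappears from the diagram because the function does not depend on it.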

Source journal: Journal of Applied Logic
Categories: Computer Science, Artificial Intelligence; Computer Science, Theory & Methods
CiteScore: 1.13
Self-citation rate: 0.00%
Articles published: 0
Review time: >12 weeks
Journal description: Cessation.