Rationally inattentive Markov decision processes over a finite horizon

Ehsan Shafieepoorfard, M. Raginsky
2017 51st Asilomar Conference on Signals, Systems, and Computers, October 2017
DOI: 10.1109/ACSSC.2017.8335416 (https://doi.org/10.1109/ACSSC.2017.8335416)

Abstract

The framework of Rationally Inattentive Markov Decision Processes (RIMDPs) is an extension of Partially Observable Markov Decision Processes (POMDPs) to the case in which the observation kernel that governs the information-gathering process is also selected by the decision maker. At each time step, an observation kernel is chosen subject to a constraint on the Shannon conditional mutual information between the history of states and the current observation, given the history of past observations. This set-up arises naturally in the context of networked control systems, artificial intelligence, and economic decision-making by boundedly rational agents. We show that, under certain structural assumptions on the information pattern and on the optimal policy, Bellman's Principle of Optimality can be used to derive a general dynamic programming recursion for this problem that reduces to solving a sequence of conditional rate-distortion problems.
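
To make the phrase "rate-distortion problem" concrete, the following is a minimal sketch of one single-stage, unconditional instance, solved in Lagrangian form with the classical Blahut-Arimoto iteration. It is not the authors' recursion: the conditioning on past observations and the backward pass over the horizon are omitted, and the names `blahut_arimoto`, `mutual_information`, `p_x`, `cost`, and `beta` are assumptions introduced here for illustration. The trade-off parameter `beta` plays the role of a multiplier on the information constraint: a small `beta` yields a nearly uninformative kernel, a large `beta` a nearly deterministic one.

```python
import numpy as np


def blahut_arimoto(p_x, cost, beta, n_iter=500, tol=1e-12):
    """Approximately minimise E[cost(X, U)] + (1 / beta) * I(X; U) over kernels q(u | x).

    p_x  : shape (n_x,), source distribution (hypothetical single-stage state law)
    cost : shape (n_x, n_u), distortion / stage cost
    beta : positive trade-off parameter (assumed Lagrange multiplier)
    """
    n_x, n_u = cost.shape
    q_u = np.full(n_u, 1.0 / n_u)  # initial marginal over outputs
    for _ in range(n_iter):
        # Kernel update: q(u | x) proportional to q(u) * exp(-beta * cost(x, u)).
        logits = np.log(np.maximum(q_u, 1e-300))[None, :] - beta * cost
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        kernel = np.exp(logits)
        kernel /= kernel.sum(axis=1, keepdims=True)
        # Marginal update: q(u) = sum_x p(x) q(u | x).
        new_q_u = p_x @ kernel
        converged = np.max(np.abs(new_q_u - q_u)) < tol
        q_u = new_q_u
        if converged:
            break
    return kernel, q_u


def mutual_information(p_x, kernel):
    """Shannon mutual information I(X; U) in nats for source p_x and kernel q(u | x)."""
    joint = p_x[:, None] * kernel
    q_u = joint.sum(axis=0)
    product = p_x[:, None] * q_u[None, :]
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / product[mask])))


if __name__ == "__main__":
    p_x = np.full(3, 1.0 / 3.0)       # uniform source over three states
    cost = 1.0 - np.eye(3)            # Hamming distortion
    for beta in (0.1, 1.0, 3.0, 50.0):
        kernel, _ = blahut_arimoto(p_x, cost, beta)
        info = mutual_information(p_x, kernel)
        expected_cost = float(np.sum(p_x[:, None] * kernel * cost))
        print(f"beta={beta:5.1f}  I(X;U)={info:.4f} nats  E[cost]={expected_cost:.4f}")
```

Sweeping `beta` traces out points on the rate-distortion curve for this toy source; in the setting of the abstract, one problem of this kind (conditioned on the observation history) would be solved at each stage of the dynamic programming recursion.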