Rationally inattentive Markov decision processes over a finite horizon

2017 51st Asilomar Conference on Signals, Systems, and Computers. Pub Date: 2017-10-01. DOI: 10.1109/ACSSC.2017.8335416

Ehsan Shafieepoorfard, M. Raginsky


Abstract

The framework of Rationally Inattentive Markov Decision Processes (RIMDPs) extends Partially Observable Markov Decision Processes (POMDPs) to the case in which the observation kernel that governs the information-gathering process is also selected by the decision maker. At each time, an observation kernel is chosen subject to a constraint on the Shannon conditional mutual information between the history of states and the current observation, given the history of past observations. This set-up arises naturally in networked control systems, artificial intelligence, and economic decision-making by boundedly rational agents. We show that, under certain structural assumptions on the information pattern and on the optimal policy, Bellman's Principle of Optimality can be used to derive a general dynamic programming recursion for this problem, and that this recursion reduces to solving a sequence of conditional rate-distortion problems.
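The abstract says the recursion reduces to a sequence of conditional rate-distortion problems. As a rough illustration of what one such subproblem looks like, here is a minimal sketch of the standard Blahut-Arimoto iteration for a single, unconditional rate-distortion problem over finite alphabets. It is not the authors' algorithm: the conditioning on past observations and the dynamic programming recursion are not reproduced, and the function name `blahut_arimoto`, its parameters, and the Lagrange multiplier `beta` (which trades rate against distortion) are assumptions made for this example.

```python
import math


def blahut_arimoto(p_x, dist, beta, iters=200):
    """Illustrative Blahut-Arimoto iteration (hypothetical helper).

    p_x  : source probabilities p(x)
    dist : dist[x][y] is the distortion d(x, y)
    beta : Lagrange multiplier (> 0); larger beta means lower distortion
    Returns (kernel, rate_in_nats, expected_distortion).
    """
    n_x, n_y = len(p_x), len(dist[0])
    q_y = [1.0 / n_y] * n_y  # initial guess for the output marginal
    kernel = []
    for _ in range(iters):
        # Step 1: optimal kernel for the current output marginal.
        kernel = []
        for x in range(n_x):
            w = [q_y[y] * math.exp(-beta * dist[x][y]) for y in range(n_y)]
            z = sum(w)
            kernel.append([v / z for v in w])
        # Step 2: output marginal induced by that kernel.
        q_y = [sum(p_x[x] * kernel[x][y] for x in range(n_x))
               for y in range(n_y)]

    # Mutual information I(X;Y) and expected distortion of the final kernel.
    rate = 0.0
    distortion = 0.0
    for x in range(n_x):
        for y in range(n_y):
            joint = p_x[x] * kernel[x][y]
            if joint > 0.0:
                rate += joint * math.log(kernel[x][y] / q_y[y])
                distortion += joint * dist[x][y]
    return kernel, rate, distortion


if __name__ == "__main__":
    # Uniform binary source with Hamming distortion.
    hamming = [[0.0, 1.0], [1.0, 0.0]]
    _, rate, distortion = blahut_arimoto([0.5, 0.5], hamming, beta=2.0)
    print(rate, distortion)
```

In this reading, the kernel returned by the iteration plays the role of the observation kernel chosen by the decision maker, and the rate is the mutual information it consumes. Sweeping `beta` traces out different points on the rate-distortion trade-off.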



