Open Problem—Convergence and Asymptotic Optimality of the Relative Value Iteration in Ergodic Control


Stochastic Systems. Pub Date: 2019-09-17. DOI: 10.1287/stsy.2019.0040

A. Arapostathis

Citations: 2

Abstract

The relative value iteration scheme (RVI) for Markov decision processes (MDP) dates back to White (1963), a seminal work, which introduced an algorithm for solving the ergodic dynamic programming e...

Source journal

Stochastic Systems (Decision Sciences: Statistics, Probability and Uncertainty)

CiteScore: 3.70

Self-citation rate: 0.00%
