Undiscounted two-person zero-sum communicating stochastic games

Proceedings of the 38th IEEE Conference on Decision and Control (Cat. No.99CH36304), Pub Date: 1999-12-07, DOI: 10.1109/CDC.1999.832844

M. Baykal-Gursoy, Z. Avsar

Citations: 2

Abstract

Consider two-person zero-sum communicating stochastic games with finite state and finite action spaces under the long-run average payoff criterion. A communicating game is irreducible on a restricted strategy space in which every pair of actions is taken with positive probability. The proposed approach applies Hoffman and Karp's (1966) algorithm for irreducible games successively over a sequence of progressively larger restricted strategy spaces until an ε-optimal stationary policy pair is obtained, for any ε > 0. The algorithm converges for games that have optimal strategies with a value independent of the initial state.
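The irreducibility property on the restricted strategy space can be illustrated with a small sketch. It is not the Hoffman and Karp algorithm and is not taken from the paper; it only builds the Markov chain induced by a pair of stationary strategies and checks whether every state can reach every other state. The function names `induced_chain` and `is_irreducible`, the data layout `P[s][a][b][t]`, and the two-state toy game are all assumptions made for illustration.

```python
from itertools import product


def induced_chain(P, f, g):
    """Transition matrix of the chain induced by stationary strategies f and g.

    P[s][a][b][t]: probability of moving from state s to state t when
    player 1 plays a and player 2 plays b (hypothetical layout).
    f[s][a], g[s][b]: action probabilities of players 1 and 2 in state s.
    """
    n = len(P)
    Q = [[0.0] * n for _ in range(n)]
    for s in range(n):
        for a, b in product(range(len(f[s])), range(len(g[s]))):
            w = f[s][a] * g[s][b]  # probability of the action pair (a, b)
            for t in range(n):
                Q[s][t] += w * P[s][a][b][t]
    return Q


def is_irreducible(Q):
    """True if every state reaches every other state with positive probability."""
    n = len(Q)
    for start in range(n):
        seen, stack = {start}, [start]
        while stack:  # depth-first search over positive-probability transitions
            s = stack.pop()
            for t in range(n):
                if Q[s][t] > 0 and t not in seen:
                    seen.add(t)
                    stack.append(t)
        if len(seen) < n:
            return False
    return True


# Toy game (assumed): the state changes only when both players pick action 1.
P = [
    [[[1, 0], [1, 0]], [[1, 0], [0, 1]]],  # transitions out of state 0
    [[[0, 1], [0, 1]], [[0, 1], [1, 0]]],  # transitions out of state 1
]
pure = [[1.0, 0.0], [1.0, 0.0]]    # both players always play action 0
mixed = [[0.5, 0.5], [0.5, 0.5]]   # every action pair has probability 0.25

Q_pure = induced_chain(P, pure, pure)
Q_mixed = induced_chain(P, mixed, mixed)
print(is_irreducible(Q_pure), is_irreducible(Q_mixed))
```

In this toy game the pure strategy pair leaves each state absorbing, so its chain is not irreducible, whereas the fully mixed pair induces an irreducible chain. That is the situation the abstract describes for communicating games.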
