Successive approximation algorithms for stochastic games-Numerical comparisons

1978 IEEE Conference on Decision and Control including the 17th Symposium on Adaptive Processes. DOI: 10.1109/CDC.1978.268108

P. Dasgupta, W. Gruver


Abstract

In this paper we treat algorithmic methods for solution of stochastic games. A value iteration method incorporating bounds in a test for suboptimality is compared with policy iteration for three types of transition probability matrices. Numerical experiments demonstrate the superiority of the value iteration technique for problems with special structure.
