{"title":"LP formulation of asymmetric zero-sum stochastic games","authors":"Lichun Li, J. Shamma","doi":"10.1109/CDC.2014.7039680","DOIUrl":null,"url":null,"abstract":"This paper provides an efficient linear programming (LP) formulation of asymmetric two player zero-sum stochastic games with finite horizon. In these stochastic games, only one player is informed of the state at each stage, and the transition law is only controlled by the informed player. Compared with the LP formulation of extensive stochastic games whose size grows polynomially with respect to the size of the state and the size of the uninformed player's actions, our proposed LP formulation has its size to be linear with respect to the size of the state and the size of the uninformed player, and hence greatly reduces the computational complexity. A travelling inspector problem is used to demonstrate the efficiency of the proposed LP formulation.","PeriodicalId":202708,"journal":{"name":"53rd IEEE Conference on Decision and Control","volume":"45 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"29","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"53rd IEEE Conference on Decision and Control","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CDC.2014.7039680","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 29
Abstract
This paper provides an efficient linear programming (LP) formulation of asymmetric two player zero-sum stochastic games with finite horizon. In these stochastic games, only one player is informed of the state at each stage, and the transition law is only controlled by the informed player. Compared with the LP formulation of extensive stochastic games whose size grows polynomially with respect to the size of the state and the size of the uninformed player's actions, our proposed LP formulation has its size to be linear with respect to the size of the state and the size of the uninformed player, and hence greatly reduces the computational complexity. A travelling inspector problem is used to demonstrate the efficiency of the proposed LP formulation.