Reinforcement Learning Algorithms for Uncertain, Dynamic, Zero-Sum Games
S. Mukhopadhyay, Omkar J. Tilak, S. Chakrabarti
2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 48-54. Published 2018-12-01.
DOI: https://doi.org/10.1109/ICMLA.2018.00015
Dynamic zero-sum games are a model of multiagent decision-making that has been well studied in the mathematical game theory literature. In this paper, we derive a sufficient condition for the existence of a solution to this problem. We then discuss reinforcement learning strategies for solving such a dynamic game under uncertainty, where both the game matrices at the various states and the transition probabilities between states under different agent actions are unknown. We present a novel algorithm based on heterogeneous games of learning automata (HEGLA), together with algorithms based on model-based and model-free reinforcement learning, as possible approaches to learning the Markov equilibrium policies, under the assumption that the sufficient conditions for existence are satisfied. In the HEGLA algorithm, each automaton simultaneously plays zero-sum games with some automata and identical-payoff games with others. Simulation studies are reported to complement the theoretical and algorithmic discussions.
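To make the problem setting concrete, the following is a minimal sketch of what "solving" a dynamic zero-sum game means when the model is fully known. It is not the HEGLA algorithm or any other method from the paper, which address the harder case where game matrices and transition probabilities are unknown. The sketch uses classical Shapley-style value iteration and assumes a discounted payoff criterion, two actions per player in every state, and a row player who maximizes; the discount factor `gamma`, the function names, and the toy two-state game are all illustrative assumptions.

```python
from typing import List

Matrix = List[List[float]]


def solve_2x2_zero_sum(m: Matrix) -> float:
    """Value of a 2x2 zero-sum matrix game for the row (maximizing) player."""
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))  # best payoff a pure row can guarantee
    minimax = min(max(a, c), max(b, d))  # lowest cap a pure column can enforce
    if maximin == minimax:  # pure-strategy saddle point exists
        return maximin
    # No saddle point: both players mix, and the value has a closed form.
    return (a * d - b * c) / (a + d - b - c)


def shapley_value_iteration(rewards, transitions, gamma=0.5,
                            tol=1e-10, max_iters=10_000):
    """Iterate V(s) = val[ r(s,i,j) + gamma * sum_t P(t | s,i,j) V(t) ].

    rewards[s][i][j]        : payoff to the row player in state s (assumed known)
    transitions[s][i][j][t] : probability of moving from s to t (assumed known)
    """
    n_states = len(rewards)
    values = [0.0] * n_states
    for _ in range(max_iters):
        new_values = []
        for s in range(n_states):
            # Auxiliary matrix game at state s given the current value estimates.
            q = [[rewards[s][i][j]
                  + gamma * sum(p * v for p, v in zip(transitions[s][i][j], values))
                  for j in range(2)]
                 for i in range(2)]
            new_values.append(solve_2x2_zero_sum(q))
        delta = max(abs(x - y) for x, y in zip(new_values, values))
        values = new_values
        if delta < tol:
            break
    return values


# Hypothetical toy game: state 0 is matching pennies and always moves to
# state 1; state 1 pays 1 for every action pair and loops on itself.
rewards = [
    [[1.0, -1.0], [-1.0, 1.0]],
    [[1.0, 1.0], [1.0, 1.0]],
]
to_state_1 = [0.0, 1.0]
transitions = [
    [[to_state_1, to_state_1], [to_state_1, to_state_1]],
    [[to_state_1, to_state_1], [to_state_1, to_state_1]],
]
values = shapley_value_iteration(rewards, transitions, gamma=0.5)
print(values)
```

In this toy game the value of state 1 approaches 1 / (1 - 0.5) = 2, and the value of state 0 approaches 0 + 0.5 * 2 = 1, since matching pennies itself has value 0. A learning approach of the kind the abstract describes would have to reach the same equilibrium values without access to `rewards` or `transitions`.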