Convergence of Strategies in Simple Co-Adapting Games

Proceedings of the 2015 ACM Conference on Foundations of Genetic Algorithms XIII. Pub Date: 2015-01-17. DOI: 10.1145/2725494.2725503

T. Jansen, G. Ochoa, C. Zarges


Citations: 1

Abstract


Convergence of Strategies in Simple Co-Adapting Games

Simultaneously co-adapting agents in an uncooperative setting can result in a non-stationary environment where optimisation or learning is difficult and where the agents' strategies may not converge to solutions. This work looks at simple simultaneous-move games with two or three actions and two or three players. Fictitious play is an old but popular algorithm that can converge to solutions, albeit slowly, in self-play in games like these. It models its opponents assuming that they use stationary strategies and plays a best-response strategy to these models. We propose two new variants of fictitious play that remove this assumption and explicitly assume that the opponents use dynamic strategies. The opponent's strategy is predicted using a sequence prediction method in the first variant and a change detection method in the second variant. Empirical results show that our variants converge faster than fictitious play. However, they do not always converge exactly to correct solutions. For change detection, this is a very small number of cases, but for sequence prediction there are many. The convergence of sequence prediction is improved by combining it with fictitious play. Also, unlike in fictitious play, our variants converge to solutions in the difficult Shapley's and Jordan's games.
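As a rough illustration of the baseline described above, the following is a minimal sketch of classic two-player fictitious play in self-play. It is not the authors' code, and it does not implement their sequence prediction or change detection variants. The function name `fictitious_play`, the uniform initial counts, the lowest-index tie-breaking, and the choice of matching pennies as the test game are all assumptions made for this example. Each player treats the opponent's empirical action frequencies as a stationary mixed strategy and plays a best response to it.

```python
import numpy as np


def fictitious_play(payoff_row, payoff_col, rounds=5000):
    """Classic two-player fictitious play in self-play (illustrative sketch).

    payoff_row[i, j] and payoff_col[i, j] are the payoffs to the row and
    column player when row plays action i and column plays action j.
    Returns the empirical action frequencies of both players.
    """
    n_row, n_col = payoff_row.shape
    # Start every action count at one: a uniform prior (assumption of this sketch).
    row_counts = np.ones(n_row)
    col_counts = np.ones(n_col)
    for _ in range(rounds):
        # Stationarity assumption: the opponent's strategy is taken to be
        # its empirical action frequencies so far.
        col_belief = col_counts / col_counts.sum()
        row_belief = row_counts / row_counts.sum()
        # Best response to that belief; ties go to the lowest action index.
        row_action = int(np.argmax(payoff_row @ col_belief))
        col_action = int(np.argmax(row_belief @ payoff_col))
        row_counts[row_action] += 1
        col_counts[col_action] += 1
    return row_counts / row_counts.sum(), col_counts / col_counts.sum()


# Matching pennies: a two-action zero-sum game whose only equilibrium
# is for both players to mix uniformly.
pennies_row = np.array([[1.0, -1.0], [-1.0, 1.0]])
pennies_col = -pennies_row

row_freq, col_freq = fictitious_play(pennies_row, pennies_col)
print("row frequencies:", row_freq)
print("column frequencies:", col_freq)
```

In this game the empirical frequencies drift towards the uniform mix, but only slowly, because play cycles through runs of the same action that grow longer over time. This is consistent with the slow convergence the abstract attributes to fictitious play.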
