Xiyue Sun, Fabian R. Pieroth, Kyrill Schmid, M. Wirsing, Lenz Belzner
2022 IEEE 42nd International Conference on Distributed Computing Systems Workshops (ICDCSW), published 2022-07-01. DOI: https://doi.org/10.1109/ICDCSW56584.2022.00031
On Learning Stable Cooperation in the Iterated Prisoner's Dilemma with Paid Incentives
An essential step towards collective intelligence in systems composed of multiple independent and autonomous agents is that individual decision-makers are capable of acting cooperatively. Cooperation is especially challenging in environments where collective and individual rationality diverge, as in the Prisoner's Dilemma (PD), which is often used to test whether algorithms can escape its single, non-optimal Nash equilibrium. In this paper, we extend the approach “Learning to Incentivize other Learning Agents” in two ways: 1. We analyze the impact of the payoff matrices on incentive updates, as different payoff matrices can accelerate or decelerate the growth of incentives. 2. We adapt the concept of the market from “Action Markets in Deep Multi-Agent Reinforcement Learning” to iterated PD games in order to trade incentives, i.e., the final revenue of an agent is its game revenue minus the incentive it provided, and propose (sufficient) conditions for reaching stable two-way cooperation under specific assumptions.
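The revenue rule in the abstract (game revenue minus the incentive provided) can be illustrated with a small static check. This is only a sketch under stated assumptions, not the paper's learning procedure. The payoff values `T, R, P, S = 5, 3, 1, 0`, the fixed incentive amounts, the rule that an agent pays its incentive whenever the other agent cooperates, and the helper names `net_payoffs` and `pure_nash` are all hypothetical choices made for illustration.

```python
from itertools import product

C, D = 0, 1  # actions: cooperate, defect

# Assumed illustrative PD payoffs with T > R > P > S (not taken from the paper).
T, R, P, S = 5.0, 3.0, 1.0, 0.0
GAME = {(C, C): (R, R), (C, D): (S, T), (D, C): (T, S), (D, D): (P, P)}


def net_payoffs(a1, a2, inc1, inc2):
    """Final revenue of both agents for one round.

    inc1 is the amount agent 1 pays agent 2 whenever agent 2 cooperates;
    inc2 is the amount agent 2 pays agent 1 whenever agent 1 cooperates.
    Each agent's revenue is its game payoff, minus the incentive it paid,
    plus the incentive it received (assumed payment rule).
    """
    g1, g2 = GAME[(a1, a2)]
    paid1 = inc1 if a2 == C else 0.0
    paid2 = inc2 if a1 == C else 0.0
    return g1 - paid1 + paid2, g2 - paid2 + paid1


def pure_nash(inc1, inc2):
    """Return all pure-strategy Nash equilibria of the incentive-adjusted game."""
    equilibria = []
    for a1, a2 in product((C, D), repeat=2):
        u1, u2 = net_payoffs(a1, a2, inc1, inc2)
        # An action profile is an equilibrium if neither agent gains by deviating alone.
        best1 = all(u1 >= net_payoffs(b, a2, inc1, inc2)[0] for b in (C, D))
        best2 = all(u2 >= net_payoffs(a1, b, inc1, inc2)[1] for b in (C, D))
        if best1 and best2:
            equilibria.append((a1, a2))
    return equilibria


if __name__ == "__main__":
    print("no incentives:      ", pure_nash(0.0, 0.0))
    print("one-sided incentive:", pure_nash(2.5, 0.0))
    print("mutual incentives:  ", pure_nash(2.5, 2.5))
```

With these assumed numbers, mutual defection is the only equilibrium without incentives, a one-sided incentive makes only the paid agent cooperate, and mutual incentives larger than `T - R` make mutual cooperation the only equilibrium. How agents learn such incentive levels over repeated play is the subject of the paper and is not modeled here.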