Synthesis of Reward Machines for Multi-Agent Equilibrium Design (Full Version)
Muhammad Najib, Giuseppe Perelli
arXiv:2408.10074 · arXiv - CS - Multiagent Systems · 2024-08-19
Mechanism design is a well-established game-theoretic paradigm for designing
games to achieve desired outcomes. This paper addresses a closely related but
distinct concept, equilibrium design. Unlike mechanism design, the designer's
authority in equilibrium design is more constrained; she can modify only the
incentive structures of a given game to achieve certain outcomes, without the
ability to create the game from scratch. We study the problem of equilibrium
design using dynamic incentive structures, known as reward machines. We use
weighted concurrent game structures for the game model, with goals (for the
players and the designer) defined as mean-payoff objectives. We show how reward
machines can be used to represent dynamic incentives that allocate rewards in a
manner that optimises the designer's goal. We also introduce the main decision
problem within our framework, the payoff improvement problem. This problem
essentially asks whether there exists a dynamic incentive (represented by some
reward machine) that can improve the designer's payoff by more than a given
threshold value. We present two variants of the problem: strong and weak. We
demonstrate that both can be solved in polynomial time using a Turing machine
equipped with an NP oracle. Furthermore, we establish that these variants
are either NP-hard or coNP-hard. Finally, we show how to synthesise the
corresponding reward machine if it exists.
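The central objects of the abstract, a reward machine that reads a play step by step and emits extra rewards, and a mean-payoff objective that averages weights along the play, can be sketched informally. The states, labels, weights, and the particular machine below are hypothetical illustrations invented for this sketch, not taken from the paper.

```python
def mean_payoff(weights):
    """Average weight of a finite play prefix; the mean-payoff value of an
    infinite play is the limit of such averages."""
    return sum(weights) / len(weights)

class RewardMachine:
    """A finite-state transducer: reads one label per game step and emits an
    extra reward, which is added to the player's base weight at that step."""
    def __init__(self, transitions, initial):
        # transitions: (state, label) -> (next_state, reward)
        self.transitions = transitions
        self.state = initial

    def step(self, label):
        self.state, reward = self.transitions[(self.state, label)]
        return reward

# Hypothetical two-state machine: pays 2 for each 'work' that follows 'work',
# i.e. it dynamically rewards sustained effort rather than single steps.
rm = RewardMachine(
    transitions={
        ("q0", "work"): ("q1", 0),
        ("q0", "rest"): ("q0", 0),
        ("q1", "work"): ("q1", 2),
        ("q1", "rest"): ("q0", 0),
    },
    initial="q0",
)

play_labels = ["work", "work", "rest", "work", "work", "work"]
base_weights = [1, 1, 0, 1, 1, 1]  # illustrative weights of the game structure

rewards = [rm.step(label) for label in play_labels]
with_incentive = [w + r for w, r in zip(base_weights, rewards)]

print(mean_payoff(base_weights))   # payoff of this prefix without the incentive
print(mean_payoff(with_incentive)) # payoff after the reward machine's top-up
```

The payoff improvement problem then asks, roughly, whether some such machine raises the designer's mean payoff in equilibrium by more than a given threshold; this sketch only shows how a machine reshapes the payoff of one fixed play.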