Mechanism design for public projects via three machine learning based approaches

IF 2.6 | Region 3, Computer Science | JCR Q3, AUTOMATION & CONTROL SYSTEMS

Autonomous Agents and Multi-Agent Systems Pub Date : 2024-04-20 DOI:10.1007/s10458-024-09647-8

Mingyu Guo, Diksha Goel, Guanhua Wang, Runqi Guo, Yuko Sakurai, Muhammad Ali Babar

Volume 38, Issue 1. Article page: https://link.springer.com/article/10.1007/s10458-024-09647-8

Citations: 0

Abstract

We study mechanism design for nonexcludable and excludable binary public project problems. Our aim is to maximize the expected number of consumers and the expected agents’ welfare. We first show that for the nonexcludable public project model, there is no need for machine learning based mechanism design. We identify a sufficient condition on the prior distribution for the existing conservative equal costs mechanism to be the optimal strategy-proof and individually rational mechanism. For general distributions, we propose a dynamic program that solves for the optimal mechanism. For the excludable public project model, we identify a similar sufficient condition for the existing serial cost sharing mechanism to be optimal for 2 and 3 agents. We derive a numerical upper bound and use it to show that for several common distributions, the serial cost sharing mechanism is close to optimality. The serial cost sharing mechanism is not optimal in general. We propose three machine learning based approaches for designing better performing mechanisms. We focus on the family of largest unanimous mechanisms, which characterizes all strategy-proof and individually rational mechanisms for the excludable public project model. A largest unanimous mechanism describes an iterative mechanism, which is defined by an exponential number of mechanism parameters. Our first approach describes the largest unanimous mechanism family using a neural network and training is carried out by minimizing a cost function that combines the mechanism design objective and the constraint violation penalty. We interpret the largest unanimous mechanisms as price-oriented rationing-free (PORF) mechanisms, which enables us to move the mechanisms’ iterative decision making off the neural network, to a separate simulation process, therefore avoiding the vanishing gradient problem. We also feed the prior distribution’s analytical form into the cost function to achieve high-quality gradients for efficient training. 
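As a rough illustration of the two existing mechanisms named above, the following Python sketch implements them under their usual textbook definitions. The abstract itself does not spell these definitions out, so the normalization of the project cost to 1, the tie-breaking rule (an agent accepts when its value equals the offer), and the function names `conservative_equal_costs` and `serial_cost_sharing` are all assumptions made for this example.

```python
from typing import List, Tuple


def conservative_equal_costs(values: List[float], cost: float = 1.0) -> Tuple[bool, float]:
    """Nonexcludable project (assumed definition): build only if every
    agent is willing to pay an equal share of the cost."""
    share = cost / len(values)
    built = all(v >= share for v in values)
    # Every agent pays `share` if the project is built, nothing otherwise.
    return built, (share if built else 0.0)


def serial_cost_sharing(values: List[float], cost: float = 1.0) -> Tuple[List[int], float]:
    """Excludable project (assumed definition): find the largest k such
    that at least k agents are willing to pay cost / k; those agents
    become the consumers and each pays cost / k."""
    n = len(values)
    for k in range(n, 0, -1):
        share = cost / k
        willing = [i for i, v in enumerate(values) if v >= share]
        if len(willing) >= k:
            return willing, share
    # No group can cover the cost: the project is not built.
    return [], 0.0


if __name__ == "__main__":
    profile = [0.6, 0.55, 0.1]
    print(conservative_equal_costs(profile))  # agent 2 rejects the equal share
    print(serial_cost_sharing(profile))       # agents 0 and 1 split the cost
```

On the profile `[0.6, 0.55, 0.1]` the nonexcludable mechanism does not build, because one agent rejects the equal share, while the excludable mechanism serves two consumers. The number of consumers is simply the length of the returned list.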
Our second approach treats the mechanism design task as a Markov Decision Process with an exponential number of states. During the Markov decision process, the non-consumers are gradually removed from the system. We train multiple neural networks, each for a different number of remaining agents, to learn the optimal value function on the states. Training is carried out by supervised learning toward a set of manually prepared base cases and the Bellman equation. Our third approach is based on reinforcement learning for a Partially Observable Markov Decision Process. Each RL episode randomly draws a type profile, which is hidden from the RL agent (mechanism designer). The RL agent only observes which cost share offers have been accepted under the largest unanimous mechanism under discussion. We use a continuous action space reinforcement learning approach to adjust the offer policy (i.e., adjust mechanism parameters). Lastly, our first two approaches use “supervision to manual mechanisms” as a systematic way for network initialization, which is potentially valuable for machine learning based mechanism design in general.
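The iterative process described above, in which cost share offers are made and non-consumers are gradually removed, can be sketched as a small simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the project cost is normalized to 1, an agent accepts when its value is at least the offer, and the names `run_iterative_mechanism`, `equal_share_offers`, and `skewed_offers` are hypothetical. The sketch also does not verify the monotonicity condition that offer policies are usually required to satisfy for strategy-proofness (an agent's offer should not decrease as the coalition shrinks).

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

# An offer policy maps the current coalition to one cost share per member.
OfferPolicy = Callable[[FrozenSet[int]], Dict[int, float]]


def equal_share_offers(coalition: FrozenSet[int]) -> Dict[int, float]:
    """Split the (normalized) cost of 1 equally among remaining agents."""
    share = 1.0 / len(coalition)
    return {i: share for i in coalition}


def skewed_offers(coalition: FrozenSet[int]) -> Dict[int, float]:
    """Hypothetical policy: agent 0 pays half whenever others remain."""
    if 0 in coalition and len(coalition) > 1:
        others = coalition - {0}
        rest = 0.5 / len(others)
        return {0: 0.5, **{i: rest for i in others}}
    return equal_share_offers(coalition)


def run_iterative_mechanism(
    values: List[float], offer_policy: OfferPolicy
) -> Tuple[List[int], Dict[int, float]]:
    """Offer cost shares, drop rejecting agents, and repeat until the
    remaining coalition accepts unanimously or becomes empty."""
    coalition = frozenset(range(len(values)))
    while coalition:
        offers = offer_policy(coalition)
        accepting = frozenset(i for i in coalition if values[i] >= offers[i])
        if accepting == coalition:
            return sorted(coalition), offers  # unanimous: build for these agents
        coalition = accepting  # rejecting agents become non-consumers
    return [], {}  # nobody left: the project is not built


if __name__ == "__main__":
    profile = [0.6, 0.3, 0.3]
    print(run_iterative_mechanism(profile, equal_share_offers))
    print(run_iterative_mechanism(profile, skewed_offers))
```

With equal-share offers the simulation reproduces the serial cost sharing outcome. On the profile `[0.6, 0.3, 0.3]` equal shares serve nobody, whereas the skewed policy serves all three agents, which illustrates why the choice of offer policy (the mechanism parameters) is worth optimizing.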




Source journal

Autonomous Agents and Multi-Agent Systems (Engineering and Technology; Computer Science: Artificial Intelligence)

CiteScore: 6.00

Self-citation rate: 5.30%

Review time: >12 weeks

Journal description: This is the official journal of the International Foundation for Autonomous Agents and Multi-Agent Systems. It provides a leading forum for disseminating significant original research results in the foundations, theory, development, analysis, and applications of autonomous agents and multi-agent systems. Coverage in Autonomous Agents and Multi-Agent Systems includes, but is not limited to:

Agent decision-making architectures and their evaluation, including: cognitive models; knowledge representation; logics for agency; ontological reasoning; planning (single and multi-agent); reasoning (single and multi-agent).

Cooperation and teamwork, including: distributed problem solving; human-robot/agent interaction; multi-user/multi-virtual-agent interaction; coalition formation; coordination.

Agent communication languages, including: their semantics, pragmatics, and implementation; agent communication protocols and conversations; agent commitments; speech act theory.

Ontologies for agent systems, agents and the semantic web, agents and semantic web services, Grid-based systems, and service-oriented computing.

Agent societies and societal issues, including: artificial social systems; environments, organizations and institutions; ethical and legal issues; privacy, safety and security; trust, reliability and reputation.

Agent-based system development, including: agent development techniques, tools and environments; agent programming languages; agent specification or validation languages.

Agent-based simulation, including: emergent behavior; participatory simulation; simulation techniques, tools and environments; social simulation.

Agreement technologies, including: argumentation; collective decision making; judgment aggregation and belief merging; negotiation; norms.

Economic paradigms, including: auction and mechanism design; bargaining and negotiation; economically-motivated agents; game theory (cooperative and non-cooperative); social choice and voting.

Learning agents, including: computational architectures for learning agents; evolution, adaptation; multi-agent learning.

Robotic agents, including: integrated perception, cognition, and action; cognitive robotics; robot planning (including action and motion planning); multi-robot systems.

Virtual agents, including: agents in games and virtual environments; companion and coaching agents; modeling personality, emotions; multimodal interaction; verbal and non-verbal expressiveness.

Significant, novel applications of agent technology.

Comprehensive reviews and authoritative tutorials of research and practice in agent systems.

Comprehensive and authoritative reviews of books dealing with agents and multi-agent systems.