{"title":"“Provably fair” algorithms may perpetuate racial and gender bias: a study of salary dispute resolution","authors":"James Hale, Peter H. Kim, Jonathan Gratch","doi":"10.1007/s10458-025-09703-x","DOIUrl":"10.1007/s10458-025-09703-x","url":null,"abstract":"<div><p>Prior work suggests automated dispute resolution tools using “provably fair” algorithms can address disparities between demographic groups. These methods use multi-criteria elicited preferences from all disputants and satisfy constraints to generate “fair” solutions. However, we analyze the potential for inequity to permeate proposals through the preference elicitation stage. This possibility arises if dispositional attitudes differ between demographics and those dispositions affect elicited preferences. Specifically, risk aversion plays a prominent role in predicting preferences. Risk aversion predicts a weaker relative preference for <i>salary</i> and a softer within-issue utility for each issue; this leads to worse compensation packages for risk-averse groups. These results raise important questions in AI-value alignment about whether an AI mediator should take explicit preferences at face value.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"39 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10458-025-09703-x.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143602303","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Investigating the impact of direct punishment on the emergence of cooperation in multi-agent reinforcement learning systems","authors":"Nayana Dasgupta, Mirco Musolesi","doi":"10.1007/s10458-025-09698-5","DOIUrl":"10.1007/s10458-025-09698-5","url":null,"abstract":"<div><p>Solving the problem of cooperation is fundamentally important for the creation and maintenance of functional societies. Problems of cooperation are omnipresent within human society, with examples ranging from navigating busy road junctions to negotiating treaties. As the use of AI becomes more pervasive throughout society, the need for socially intelligent agents capable of navigating these complex cooperative dilemmas is becoming increasingly evident. Direct punishment is a ubiquitous social mechanism that has been shown to foster the emergence of cooperation in both humans and non-humans. In the natural world, direct punishment is often strongly coupled with partner selection and reputation and used in conjunction with third-party punishment. The interactions between these mechanisms could potentially enhance the emergence of cooperation within populations. However, no previous work has evaluated the learning dynamics and outcomes emerging from multi-agent reinforcement learning populations that combine these mechanisms. This paper addresses this gap. It presents a comprehensive analysis and evaluation of the behaviors and learning dynamics associated with direct punishment, third-party punishment, partner selection, and reputation. Finally, we discuss the implications of using these mechanisms on the design of cooperative AI systems.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"39 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10458-025-09698-5.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143583433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Emergent language: a survey and taxonomy","authors":"Jannik Peters, Constantin Waubert de Puiseau, Hasan Tercan, Arya Gopikrishnan, Gustavo Adolpho Lucas de Carvalho, Christian Bitter, Tobias Meisen","doi":"10.1007/s10458-025-09691-y","DOIUrl":"10.1007/s10458-025-09691-y","url":null,"abstract":"<div><p>The field of emergent language represents a novel area of research within the domain of artificial intelligence, particularly within the context of multi-agent reinforcement learning. Although the concept of studying language emergence is not new, early approaches were primarily concerned with explaining human language formation, with little consideration given to its potential utility for artificial agents. In contrast, studies based on reinforcement learning aim to develop communicative capabilities in agents that are comparable to or even superior to human language. Thus, they extend beyond the learned statistical representations that are common in natural language processing research. This gives rise to a number of fundamental questions, from the prerequisites for language emergence to the criteria for measuring its success. This paper addresses these questions by providing a comprehensive review of relevant scientific publications on emergent language in artificial intelligence. Its objective is to serve as a reference for researchers interested in or proficient in the field. Consequently, the main contributions are the definition and overview of the prevailing terminology, the analysis of existing evaluation methods and metrics, and the description of the identified research gaps.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"39 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10458-025-09691-y.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143564460","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Relationship design for socially-aware behavior in static games","authors":"Shenghui Chen, Yigit E. Bayiz, David Fridovich-Keil, Ufuk Topcu","doi":"10.1007/s10458-025-09699-4","DOIUrl":"10.1007/s10458-025-09699-4","url":null,"abstract":"<div><p>Autonomous agents can adopt socially-aware behaviors to reduce social costs, mimicking the way animals interact in nature and humans in society. We present a new approach to model socially-aware decision-making that includes two key elements: bounded rationality and inter-agent relationships. We capture the inter-agent relationships by introducing a novel model called a relationship game and encode agents’ bounded rationality using quantal response equilibria. For each relationship game, we define a social cost function and formulate a mechanism design problem to optimize weights for relationships that minimize social cost at the equilibrium. We address the multiplicity of equilibria by presenting the problem in two forms: Min-Max and Min-Min, aimed respectively at minimizing the highest and lowest social costs among the equilibria. We compute the quantal response equilibrium by solving a least-squares problem defined with its Karush-Kuhn-Tucker conditions, and propose two projected gradient descent algorithms to solve the mechanism design problems. Numerical results on scenarios including two-lane congestion and congestion with an ambulance confirm that these algorithms consistently reach the equilibrium with the intended social costs.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"39 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143554057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal matchings with one-sided preferences: fixed and cost-based quotas","authors":"K. A. Santhini, Govind S. Sankar, Meghana Nasre","doi":"10.1007/s10458-025-09693-w","DOIUrl":"10.1007/s10458-025-09693-w","url":null,"abstract":"<div><p>We consider the well-studied many-to-one bipartite matching problem of assigning applicants <span>\\(\\varvec{\\mathcal {A}}\\)</span> to posts <span>\\(\\varvec{\\mathcal {P}}\\)</span> where applicants rank posts in the order of preference. This setting models many important real-world allocation problems like assigning students to courses and applicants to jobs, among many others. In such scenarios, it is natural to ask for an allocation that satisfies guarantees of the form “match at least 80% of applicants to one of their top three choices” or “it is unacceptable to leave more than 10% of applicants unassigned”. The well-studied notions of rank-maximality and fairness fail to capture such requirements due to their property of optimizing extreme ends of the <i>signature</i> of a matching. We, therefore, propose a novel optimality criterion, which we call the “weak dominance” of ranks.</p><p>We investigate the computational complexity of the new notion of optimality in the setting where posts have associated <i>fixed</i> quotas. We prove that under the fixed quota setting, the problem turns out to be NP-hard under natural restrictions. We provide randomized algorithms in the fixed quota setting when the number of ranks is constant. We also study the problem under a <i>cost-based quota</i> setting and show that a matching that weakly dominates the input signature and has minimum total cost can be computed efficiently. Apart from circumventing the hardness, the cost-based quota setting is motivated by real-world applications like course allocation or school choice where the capacities or quotas need not be rigid. We also show that when the objective is to minimize the maximum cost, the problem under the cost-based quota setting turns out to be NP-hard. To complement the hardness, we provide a randomized algorithm when the number of ranks is constant. We also provide an approximation algorithm which is an asymptotically faster alternative to the randomized algorithm.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"39 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143554055","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Double mixing networks based monotonic value function decomposition algorithm for swarm intelligence in UAVs","authors":"Pingping Qu, Chenglong He, Xiaotong Wu, Ershen Wang, Song Xu, Huan Liu, Xinhui Sun","doi":"10.1007/s10458-025-09700-0","DOIUrl":"10.1007/s10458-025-09700-0","url":null,"abstract":"<div><p>In multi-agent systems, particularly when facing challenges of partial observability, reinforcement learning demonstrates significant autonomous decision-making capabilities. Aiming at addressing resource allocation and collaboration issues in drone swarms operating in dynamic and unknown environments, we propose a novel deep reinforcement learning algorithm, DQMIX. We employ a framework of centralized training with decentralized execution and incorporate a partially observable Markov game model to describe the complex game environment of drone swarms. The core innovation of the DQMIX algorithm lies in its dual-mixing network structure and soft-switching mechanism. Two independent mixing networks handle local Q-values and synthesize them into a global Q-value. This structure enhances decision accuracy and system adaptability under different scenarios and data conditions. The soft-switching module allows the system to transition smoothly between the two networks, selecting the output of the network with smaller TD-errors to enhance decision stability and coherence. Simultaneously, we introduce Hindsight Experience Replay to learn from failed experiences. Experimental results using JSBSim demonstrate that DQMIX provides an effective solution for drone swarm game problems, especially in resource allocation and adversarial environments.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"39 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143554056","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A game-theoretic approach for hierarchical epidemic control","authors":"Feiran Jia, Aditya Mate, Zun Li, Shahin Jabbari, Mithun Chakraborty, Milind Tambe, Michael P. Wellman, Yevgeniy Vorobeychik","doi":"10.1007/s10458-025-09697-6","DOIUrl":"10.1007/s10458-025-09697-6","url":null,"abstract":"<div><p>We design and analyze a multi-level game-theoretic model of hierarchical policy interventions for epidemic control, such as those in response to the COVID-19 pandemic. Our model captures the potentially mismatched priorities among a hierarchy of policy-makers (e.g., federal, state, and local governments) with respect to two cost components that have opposite dependence on the policy strength—post-intervention infection rates and the socio-economic cost of policy implementation. Additionally, our model includes a crucial third factor in decisions: a cost of non-compliance with the policy-maker immediately above in the hierarchy, such as non-compliance of counties with state-level policies. We propose two novel algorithms for approximating solutions to such games. The first is based on best response dynamics (BRD) and exploits the tree structure of the game. The second combines quadratic integer programming (QIP), which enables us to collapse the two lowest levels of the game, with the best response dynamics. We experimentally characterize the scalability and equilibrium approximation quality of our two approaches against model parameters. Finally, we conduct experiments in simulations based on both synthetic and real-world data under various parameter configurations and analyze the resulting (approximate) equilibria to gain insight into the impact of decentralization on overall welfare (measured as the negative sum of costs) as well as emergent properties like social welfare, free-riding, and fairness in cost distribution among policy-makers.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"39 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10458-025-09697-6.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143496682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Solving multi-agent games on networks","authors":"Yair Vaknin, Amnon Meisels","doi":"10.1007/s10458-025-09696-7","DOIUrl":"10.1007/s10458-025-09696-7","url":null,"abstract":"<div><p>Multi-agent games on networks (GoNs) have nodes that represent agents and edges that represent interactions among agents. A special class of GoNs is composed of 2-players games on each of their edges. General GoNs have games that are played by all agents in each neighborhood. Solutions to games on networks are stable states (i.e., pure Nash equilibria), and in general one is interested in efficient solutions (of high global social welfare). This study addresses the multi-agent aspect of games on networks—a system of multiple agents that compose a game and seek a solution by performing a multi-agent (distributed) algorithm. The agents playing the game are assumed to be strategic and an iterative distributed algorithm is proposed, that lets the agents interact (i.e., negotiate) in neighborhoods in a process that guarantees the convergence of any multi-agent game on network to a globally stable state. The proposed algorithm—the TECon algorithm—iterates, one neighborhood at a time, performing a repeated social choice action. A truth-enforcing mechanism is integrated into the algorithm, collecting the valuations of agents in each neighborhood and computing incentives while eliminating strategic behavior. The proposed method is proven to converge to globally stable states that are at least as efficient as the initial state, for any game on network. A specific version of the algorithm is given for the class of Public Goods Games, where the main properties of the algorithm are guaranteed even when the strategic agents playing the game consider their possible future valuations when interacting. An extensive experimental evaluation on randomly generated games on networks demonstrates that the TECon algorithm converges very rapidly. On general forms of public goods games, the proposed algorithm outperforms previous solving methods where those methods are applicable.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"39 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10458-025-09696-7.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143489625","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Low variance trust region optimization with independent actors and sequential updates in cooperative multi-agent reinforcement learning","authors":"Bang Giang Le, Viet Cuong Ta","doi":"10.1007/s10458-025-09695-8","DOIUrl":"10.1007/s10458-025-09695-8","url":null,"abstract":"<div><p>Cooperative multi-agent reinforcement learning assumes each agent shares the same reward function and can be trained effectively using the single-agent Trust Region framework. Instead of relying on other agents’ actions, the independent actors setting considers each agent to act based only on its local information, thus having more flexible applications. However, in the sequential update framework, it is required to re-estimate the joint advantage function after each individual agent’s policy step. Despite the practical success of importance sampling, the updated advantage function suffers from exponentially high variance problems, which likely results in unstable convergence. In this work, we first analyze the high variance advantage both empirically and theoretically. To overcome this limitation, we introduce a clipping objective to control the upper bounds of the advantage fluctuation in sequential updates. With the proposed objective, we provide a monotonic bound with sub-linear convergence to <span>\\(\\varepsilon\\)</span>-Nash Equilibria. We further derive two new practical algorithms using our clipping objective. The experiment results on three popular multi-agent reinforcement learning benchmarks show that our proposed method outperforms the tested baselines in most environments. A careful analysis of different training settings further highlights our proposed method’s stable convergence and the desired low-variance advantage estimation. For reproducibility purposes, our source code is publicly available at https://github.com/giangbang/Low-Variance-Trust-Region-MARL.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"39 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10458-025-09695-8.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143446498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An introduction to computational argumentation research from a human argumentation perspective","authors":"Ramon Ruiz-Dolz, Stella Heras, Ana García-Fornes","doi":"10.1007/s10458-025-09692-x","DOIUrl":"10.1007/s10458-025-09692-x","url":null,"abstract":"<div><p>Computational Argumentation studies how human argumentative reasoning can be approached from a computational viewpoint. Human argumentation is a complex process that has been studied from different perspectives (e.g., philosophical or linguistic) and that involves many different aspects beyond pure reasoning, such as the role of emotions, values, social contexts, and practical constraints, which are often overlooked in computational approaches to argumentation. The heterogeneity of human argumentation is present in Computational Argumentation research, in the form of various tasks that approach the main phases of argumentation individually. With the increasing interest of researchers in Artificial Intelligence, we consider that it is of great importance to provide guidance on the Computational Argumentation research area. Thus, in this paper, we present a general overview of Computational Argumentation, from the perspective of how humans argue. For that purpose, the following contributions are produced: (i) a consistent structure for Computational Argumentation research mapped with the human argumentation process; (ii) a collective understanding of the tasks approached by Computational Argumentation and their synergies; (iii) a thorough review of important advances in each of these tasks; and (iv) an analysis and a classification of the future trends in Computational Argumentation research and relevant open challenges in the area.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"39 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143396714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}