Title: Temporal network creation games: the impact of non-locality and terminals
Authors: Davide Bilò, Sarel Cohen, Tobias Friedrich, Hans Gawendowicz, Nicolas Klodt, Pascal Lenzner, George Skretas
DOI: 10.1007/s10458-026-09752-w
Autonomous Agents and Multi-Agent Systems, vol. 40, no. 1. Published 2026-05-09.

Abstract: Our economy, communication, and even our social life crucially depend on networks. These typically emerge from the interaction of many entities, which is why researchers study agent-based models of network formation. In particular, Bilò et al. [1] recently introduced a model where a network is formed by selfish agents corresponding to nodes in a given host network with edges having labels denoting their availability over time. Each agent strategically selects local, i.e., incident, edges to ensure temporal reachability towards everyone at low cost. We explore two novel conceptual features: agents can create non-incident edges, called the global setting, and agents might only want to ensure reachability of a subset of nodes, called the terminal model. For both, we study the existence, structure, and quality of equilibrium networks. For the terminal model, we prove that many properties depend on the number of terminals, and we show how to translate equilibrium constructions from the non-terminal model. For the global setting, we show the surprising result that equilibria in the global and the local model are incomparable, and we establish a high lower bound on the Price of Anarchy of the global setting that matches the upper bound of the local model. This shows the counter-intuitive fact that allowing agents more flexibility in edge creation does not improve the quality of equilibrium networks. Finally, all of our results hold for the general case where every edge can have multiple labels.

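The model's core requirement, temporal reachability, can be made concrete with a small sketch. The following earliest-arrival computation is purely illustrative (not code from the paper); it assumes time labels are positive and must strictly increase along a temporal path, edges are undirected, and an edge with multiple labels contributes one occurrence per label.

```python
from math import inf

def earliest_arrival(n, occurrences, source):
    # occurrences: (u, v, t) triples, one per time label of an undirected edge;
    # labels assumed positive and strictly increasing along a temporal path.
    arr = [inf] * n
    arr[source] = 0
    for u, v, t in sorted(occurrences, key=lambda e: e[2]):
        if arr[u] < t < arr[v]:   # traverse u -> v at time t
            arr[v] = t
        if arr[v] < t < arr[u]:   # traverse v -> u at time t
            arr[u] = t
    return arr  # arr[x] < inf  iff  x is temporally reachable from source
```

Note that an edge can exist yet be useless: if its only label precedes every possible arrival at its endpoints, it never extends a temporal path, which is what makes this setting strictly harder than static reachability.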
Title: ε-retraining reinforcement learning algorithms
Authors: Luca Marzari, Changliu Liu, Priya L. Donti, Enrico Marchesini
DOI: 10.1007/s10458-026-09748-6
Autonomous Agents and Multi-Agent Systems, vol. 40, no. 1. Published 2026-05-04.
Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10458-026-09748-6.pdf

Abstract: We present ε-retrain, a general exploration strategy for reinforcement learning (RL) that encourages adherence to behavioral preferences while preserving the convergence guarantees of the underlying RL algorithm. ε-retrain maintains a dynamic collection of retrain areas—regions of the state space where the agent previously violated a specified preference—and mixes the standard uniform restart distribution with states from these areas, according to a decaying parameter ε. This mixed retraining thus focuses on enforcing the desired behaviors in the collected areas. We develop the theory for both policy- and value-based methods, showing that: (i) in policy-based settings, our method retains monotonic improvement bounds; and (ii) in value-based settings, ε-retrain preserves convergence properties without additional assumptions. The approach is simple to integrate into existing RL algorithms and improves sample efficiency and behavioral adherence in the locomotion, power systems, and navigation tasks tested. These results establish ε-retrain as a lightweight, theoretically grounded mechanism for incorporating behavioral preferences into RL.

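The restart-distribution mixing described above can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: the function names, the geometric decay schedule, and the default parameters are all our assumptions; with probability ε a restart state is drawn from the collected retrain areas, otherwise from the uniform restart distribution.

```python
import random

def make_restart_sampler(uniform_states, eps0=0.5, decay=0.999):
    # retrain_areas collects states where a behavioral preference was violated
    retrain_areas = []
    eps = eps0

    def record_violation(state):
        retrain_areas.append(state)

    def sample_restart():
        nonlocal eps
        eps *= decay  # decaying mixing parameter
        if retrain_areas and random.random() < eps:
            return random.choice(retrain_areas)  # restart inside a retrain area
        return random.choice(uniform_states)     # standard uniform restart
    return record_violation, sample_restart
```

Because ε decays toward zero, the sampler asymptotically reverts to the uniform restart distribution, which is consistent with the claim that the underlying algorithm's convergence guarantees are preserved.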
Title: Chat with UAV – human-UAV interaction based on large language models
Authors: Haoran Wang, Zhuohang Chen, Guang Li, Bo Ma, Chuanhuang Li
DOI: 10.1007/s10458-026-09751-x
Autonomous Agents and Multi-Agent Systems, vol. 40, no. 1. Published 2026-05-04.

Abstract: UAV interaction systems are evolving from engineer-driven to user-driven designs, aiming to replace traditional predefined Human-UAV Interaction (HUI) schemes. This shift enables more personalized task planning and design, and thus a higher-quality interaction experience and greater flexibility, with applications in fields such as agriculture, aerial photography, logistics, and environmental monitoring. However, such interactions are often difficult to achieve because users and UAVs lack a common language. Large Language Models (LLMs) can understand both natural language and robot (UAV) behaviors, making personalized HUI possible. Recently, several LLM-based HUI frameworks have been proposed, but they commonly struggle with mixed task planning and execution, leading to low adaptability in complex scenarios. In this paper, we propose a novel dual-agent HUI framework. It constructs two independent LLM agents (a task-planning agent and an execution agent) and applies different prompt engineering to separately handle the understanding, planning, and execution of tasks. To verify the effectiveness and performance of the framework, we built a task database covering four typical UAV application scenarios and quantified the framework's performance using three independent metrics; we also compare different LLMs controlling the UAVs. Our user-study results demonstrate that the framework improves the smoothness of HUI and the flexibility of task execution in the scenarios we set up, effectively meeting users' personalized needs.

Title: Confidence-based curricula for multi-agent path finding via reinforcement learning
Authors: Thomy Phan, Joseph Driscoll, Justin Romberg, Sven Koenig
DOI: 10.1007/s10458-026-09747-7
Autonomous Agents and Multi-Agent Systems, vol. 40, no. 1. Published 2026-04-30.
Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10458-026-09747-7.pdf

Abstract: A wide range of real-world applications can be formulated as the Multi-Agent Path Finding (MAPF) problem, where the goal is to find collision-free paths for multiple agents with individual start and goal locations. State-of-the-art MAPF solvers are mainly centralized and rely on global information, which limits their scalability and flexibility when facing changes or new maps that require expensive replanning. Multi-agent reinforcement learning (MARL) offers an alternative approach to addressing MAPF problems by learning decentralized policies that generalize across a variety of maps. While some prior works attempt to connect both areas, the proposed techniques are heavily engineered and very complex due to the integration of many mechanisms that limit generality and are expensive to use. We argue that much simpler and more general approaches are needed to enable decentralized MAPF in a sustainable manner at significantly lower cost. In this paper, we propose Confidence-based Auto-Curriculum for Team Update Stability (CACTUS) as a lightweight MARL approach to decentralized MAPF. CACTUS defines a simple reverse curriculum scheme, where the goal of each agent is randomly placed within an allocation radius around the agent's start location. The allocation radius increases gradually as all agents improve, which is assessed by a confidence-based measure. In addition, we propose an extension called Confidence- and Conflict-Based Curriculum Learning with Allocation Radius Adaptation (C³LARA), using weighted sampling of goal locations to improve conflict resolution in scenarios of high agent density. We provide a theoretical analysis of the strengths and limitations of CACTUS regarding exploration efficiency, optimality, and multi-agent coordination. We evaluate CACTUS and C³LARA across various maps of different sizes, obstacle densities, and numbers of agents. Our experiments demonstrate better performance and generalization capabilities than state-of-the-art MARL approaches with less than 600,000 trainable parameters, which is less than 5% of the neural network size of current MARL approaches to decentralized MAPF.

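The reverse-curriculum scheme in CACTUS can be illustrated with a small sketch. The confidence threshold, the unit radius increment, and the Manhattan-distance metric below are illustrative assumptions, not the paper's exact procedure: each agent's goal is sampled within the current allocation radius of its start, and the radius widens once every agent's confidence clears the threshold.

```python
import random

def sample_goal(start, radius, free_cells):
    # draw a goal uniformly among free cells within Manhattan distance `radius`
    pool = [c for c in free_cells
            if c != start and abs(c[0] - start[0]) + abs(c[1] - start[1]) <= radius]
    return random.choice(pool) if pool else start

class AllocationCurriculum:
    def __init__(self, radius=1, threshold=0.9):
        self.radius = radius        # current allocation radius
        self.threshold = threshold  # confidence needed before widening
    def update(self, agent_confidences):
        # widen the radius only once *all* agents are confident enough,
        # matching the "increases gradually as all agents improve" idea
        if agent_confidences and min(agent_confidences) >= self.threshold:
            self.radius += 1
```

Gating the radius on the minimum confidence (rather than the mean) keeps the curriculum paced by the weakest agent, which is one plausible reading of the team-stability motivation.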
Title: Strategyproof facility location with prediction: minimizing the maximum cost
Authors: Hau Chan, Jianan Lin, Chenhao Wang
DOI: 10.1007/s10458-026-09750-y
Autonomous Agents and Multi-Agent Systems, vol. 40, no. 1. Published 2026-04-27.

Abstract: We study the mechanism design problem of facility location on a metric space in the learning-augmented framework, where mechanisms have access to imperfect predictions of the optimal facility locations. Our objective is to design strategyproof (SP) mechanisms that truthfully elicit agents' preferences over facility locations and, using the given prediction, select a facility location that approximately minimizes the maximum cost among all agents. In particular, we seek SP mechanisms whose approximation guarantees depend on the prediction error: they should achieve improved performance when the prediction is accurate (the property of consistency) while still ensuring strong worst-case guarantees when the prediction is arbitrarily inaccurate (the property of robustness). On the real line, we characterize all deterministic SP mechanisms with consistency strictly better than 2 and bounded robustness for the maximum cost. We show that any such mechanism must coincide with the MinMaxP mechanism, which returns the prediction if it lies between the two extreme agent locations and otherwise returns the agent location closest to the prediction. For any prediction error η ≥ 0, we prove that MinMaxP achieves a (1 + min(1, η))-approximation and that no deterministic SP mechanism can obtain a better approximation ratio. In addition, for two-dimensional spaces with the ℓ^p distance, we analyze the approximation guarantees of a deterministic mechanism that applies MinMaxP independently on each coordinate, as well as a randomized mechanism that selects between two deterministic mechanisms with carefully chosen probabilities. We further extend these results to the L_p-norm social cost objective on the line metric and the maximum cost objective on the tree metric. Finally, we examine the group strategyproofness of the mechanisms.

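The MinMaxP mechanism on the real line is fully specified by the description above, so it can be sketched directly (the function name and signature are ours): return the prediction if it lies between the extreme reported agent locations, otherwise the agent location closest to the prediction.

```python
def minmaxp(locations, prediction):
    # MinMaxP on the real line: keep the prediction when it falls inside
    # the interval spanned by the reported agent locations; otherwise
    # snap to the reported location nearest the prediction.
    lo, hi = min(locations), max(locations)
    if lo <= prediction <= hi:
        return prediction
    return min(locations, key=lambda x: abs(x - prediction))
```

Intuitively, snapping to an agent location when the prediction is outside the agents' range caps the damage of a bad prediction, which is what yields the bounded robustness alongside good consistency.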
Title: Automated negotiation with no information about partner utility functions using the Tentative Acceptance Unique Offers Protocol
Authors: Yasser Mohammad
DOI: 10.1007/s10458-026-09745-9
Autonomous Agents and Multi-Agent Systems, vol. 40, no. 1. Published 2026-04-17.

Abstract: Automated Negotiation (AN) is an approach for reaching agreements between self-interested agents. Early work in automated negotiation focused on proving game-theoretic results for simplified negotiation scenarios (e.g., when all agents' preferences are common knowledge). More recent research—with some exceptions—focuses on developing novel mediated negotiation mechanisms or effective strategies for the Alternating Offers Protocol (AOP) and its variations. Recently, we proposed the Tentative Acceptance Unique Offers (TAU) protocol as an alternative to AOP with promising empirical results (Mohammad 2023). In this paper, we develop a Perfect Bayesian Equilibrium (PBE) strategy for TAU in bilateral negotiations with no information about partner preferences. We analyze its theoretical properties, showing that it renders TAU exactly complete and Pareto-optimal for all bilateral discrete negotiation scenarios, and fair in the sense of a modified Kalai criterion for almost all bilateral discrete negotiation scenarios. Moreover, we empirically compare TAU combined with the proposed strategy against AOP with its state-of-the-art strategies and show that it provides a higher expected advantage for all agents while achieving higher agreement rates, Pareto-optimality, fairness, and welfare on widely used negotiation scenarios.

Title: Assisting multi-agent systems design with MOISE+ and MARL: The MAMAD method
Authors: Julien Soulé, Jean-Paul Jamont, Michel Occello, Louis-Marie Traonouez, Paul Théron
DOI: 10.1007/s10458-026-09740-0
Autonomous Agents and Multi-Agent Systems, vol. 40, no. 1. Published 2026-04-11.

Abstract: Traditional Agent-Oriented Software Engineering (AOSE) methods rely on explicit and expert-driven design for Multi-Agent Systems (MAS), but often lack automation. In contrast, Multi-Agent RL (MARL) and related fields offer automated ways to model environments and learn suitable agent policies. However, integrating these techniques into AOSE remains underexplored, partly due to the lack of control, explainability, and unifying frameworks. We propose MOISE+MARL Assisted MAS Design (MAMAD), a four-activity method framing MAS design as a constrained optimization problem: learning joint policies that maximize rewards while respecting MOISE+ roles and goals. The activities are: 1) Modeling the environment, 2) Training under organizational constraints, 3) Analyzing emergent behaviors, and 4) Transferring to real-world deployment. We evaluate MAMAD on various environments, showing that the generated MAS exhibit the expected performance, comply with design requirements, and are explainable, while reducing manual design overhead.

Title: Testing BDI-based multi-agent systems using discrete event simulation
Authors: Martina Baiardi, Samuele Burattini, Giovanni Ciatto, Danilo Pianini
DOI: 10.1007/s10458-026-09744-w
Autonomous Agents and Multi-Agent Systems, vol. 40, no. 1. Published 2026-04-02.
Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10458-026-09744-w.pdf

Abstract: Multi-agent systems are designed to deal with open, distributed systems with unpredictable dynamics, which makes them inherently hard to test. The value of using simulation for this purpose is recognized in the literature, although achieving sufficient fidelity (i.e., the degree of similarity between the simulation and the real-world system) remains a challenging task. This is exacerbated when dealing with cognitive agent models, such as the Belief-Desire-Intention (BDI) model, where the agent codebase is not suitable to run unchanged in simulation environments, thus increasing the reality gap between the deployed and simulated systems. We argue that BDI developers should be able to test in simulation the same specification that will later be deployed, with no surrogate representations. Thus, in this paper, we discuss how the control flow of BDI agents can be mapped onto a Discrete Event Simulation (DES), showing that such integration is possible at different degrees of granularity. We substantiate our claims with an open-source prototype integration between two pre-existing tools (JaKtA and Alchemist), showing that it is possible to produce a simulation-based testing environment for distributed BDI agents, and that different granularities in mapping BDI agents over DESs may lead to different degrees of fidelity.

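The kind of mapping discussed above can be sketched with a toy event loop (purely illustrative; this is not JaKtA or Alchemist code, and the fixed cycle granularity is our assumption): each agent's reasoning cycle is an event in a time-ordered queue, and executing a cycle schedules the agent's next one.

```python
import heapq

def run_des(agents, horizon, cycle_time=1.0):
    # agents: callables that take the current simulated time and perform
    # one reasoning cycle; cycle_time is the (illustrative) cycle granularity.
    queue = [(0.0, i) for i in range(len(agents))]  # (time, agent index)
    heapq.heapify(queue)
    trace = []
    while queue:
        t, i = heapq.heappop(queue)
        if t > horizon:
            break
        trace.append((t, i, agents[i](t)))          # execute one cycle
        heapq.heappush(queue, (t + cycle_time, i))  # schedule the next cycle
    return trace
```

Choosing what counts as an event (a whole reasoning cycle, a single plan step, or an individual belief update) is exactly the granularity choice the paper argues affects fidelity.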
Title: On the complexity of the two-stage majoritarian rule
Authors: Yongjie Yang
DOI: 10.1007/s10458-026-09743-x
Autonomous Agents and Multi-Agent Systems, vol. 40, no. 1. Published 2026-03-27.

Abstract: Sequential voting rules have played a crucial role in shaping decisions within parliamentary and legislative frameworks. After observing that the existing sequential rules fail several fundamental axioms, Horan and Sprumont (Theoretical Economics, 17(2), 521–537, 2022) proposed a sequential rule named the two-stage majoritarian rule (TSMR). This paper examines this rule by investigating the complexity of Agenda Control, Coalition Manipulation, Possible Winner, Necessary Winner, and eight standard election control problems. Our study offers a comprehensive insight into the complexity landscape of these problems.