{"title":"Low variance trust region optimization with independent actors and sequential updates in cooperative multi-agent reinforcement learning","authors":"Bang Giang Le, Viet Cuong Ta","doi":"10.1007/s10458-025-09695-8","DOIUrl":"10.1007/s10458-025-09695-8","url":null,"abstract":"<div><p>Cooperative multi-agent reinforcement learning assumes each agent shares the same reward function and can be trained effectively using the Trust Region framework of single-agent. Instead of relying on other agents’ actions, the independent actors setting considers each agent to act based only on its local information, thus having more flexible applications. However, in the sequential update framework, it is required to re-estimate the joint advantage function after each individual agent’s policy step. Despite the practical success of importance sampling, the updated advantage function suffers from exponentially high variance problems, which likely results in unstable convergence. In this work, we first analyze the high variance advantage both empirically and theoretically. To overcome this limitation, we introduce a clipping objective to control the upper bounds of the advantage fluctuation in sequential updates. With the proposed objective, we provide a monotonic bound with sub-linear convergence to <span>(varepsilon)</span>-Nash Equilibria. We further derive two new practical algorithms using our clipping objective. The experiment results on three popular multi-agent reinforcement learning benchmarks show that our proposed method outperforms the tested baselines in most environments. By carefully analyzing different training settings, our proposed method is highlighted with both stable convergence properties and the desired low advantage variance estimation. For reproducibility purposes, our source code is publicly available at https://github.com/giangbang/Low-Variance-Trust-Region-MARL.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"39 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10458-025-09695-8.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143446498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An introduction to computational argumentation research from a human argumentation perspective","authors":"Ramon Ruiz-Dolz, Stella Heras, Ana García-Fornes","doi":"10.1007/s10458-025-09692-x","DOIUrl":"10.1007/s10458-025-09692-x","url":null,"abstract":"<div><p>Computational Argumentation studies how human argumentative reasoning can be approached from a computational viewpoint. Human argumentation is a complex process that has been studied from different perspectives (e.g., philosophical or linguistic) and that involves many different aspects beyond pure reasoning, such as the role of emotions, values, social contexts, and practical constraints, which are often overlooked in computational approaches to argumentation. The heterogeneity of human argumentation is present in Computational Argumentation research, in the form of various tasks that approach the main phases of argumentation individually. With the increasing interest of researchers in Artificial Intelligence, we consider that it is of great importance to provide guidance on the Computational Argumentation research area. Thus, in this paper, we present a general overview of Computational Argumentation, from the perspective of how humans argue. For that purpose, the following contributions are produced: (i) a consistent structure for Computational Argumentation research mapped with the human argumentation process; (ii) a collective understanding of the tasks approached by Computational Argumentation and their synergies; (iii) a thorough review of important advances in each of these tasks; and (iv) an analysis and a classification of the future trends in Computational Argumentation research and relevant open challenges in the area.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"39 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143396714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A formal testing method for multi-agent systems using colored Petri nets
Ricardo Arend Machado, Arthur da Silva Zelindro Cardoso, Giovani Parente Farias, Eder Mateus Nunes Gonçalves, Diana Francisca Adamatti
Autonomous Agents and Multi-Agent Systems 39(1). Published 2025-02-12. DOI: 10.1007/s10458-025-09690-z

Abstract: Autonomy in software, i.e., a system's ability to make decisions and take actions independently without human intervention, is a fundamental characteristic of multi-agent systems. Testing, a crucial phase of software validation, is particularly challenging for multi-agent systems because the interaction between autonomous agents can produce emergent behaviors and collective intelligence, leading to system properties not found in any individual agent. A multi-agent system operates on at least three main dimensions: the individual level, the social level, and the communication interfaces. An organizational model formally defines a multi-agent system's structure, roles, relationships, and interactions. It represents the social layer, capturing the agents' collective dynamics and dependencies and facilitating coherent, efficient collaboration toward individual and collective goals. Our literature review identified a gap in testing the social layer of multi-agent systems. This paper presents a testing approach that formally introduces the steps to map an organizational model, here \(\mathcal{M}\)oise\(^+\), into a colored Petri net. The mapping yields a formal model of the system, which is used to generate and count test cases according to a coverage criterion. Finally, a use case called Inspector demonstrates the method by generating test cases, executing the tests, and identifying execution errors.

A multi-level explainability framework for engineering and understanding BDI agents
Elena Yan, Samuele Burattini, Jomi Fred Hübner, Alessandro Ricci
Autonomous Agents and Multi-Agent Systems 39(1). Published 2025-01-30. DOI: 10.1007/s10458-025-09689-6. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10458-025-09689-6.pdf

Abstract: As the complexity of software systems rises, explainability, i.e., the ability of a system to provide explanations of its behaviour, becomes a crucial property. This holds for any AI-based system, including autonomous systems with decision-making capabilities such as multi-agent systems. Although explainability is generally considered useful for increasing end-user trust, we argue it is also a valuable property for software engineers, developers, and designers who need to debug and validate a system's behaviour. In this paper, we propose a multi-level explainability framework for BDI agents that generates explanations of a running system from logs, at different levels of abstraction tailored to different users and their needs. We describe the mapping from logs to explanations and present a prototype tool, based on the JaCaMo platform, that implements the framework.

{"title":"Disagree and commit: degrees of argumentation-based agreements","authors":"Timotheus Kampik, Juan Carlos Nieves","doi":"10.1007/s10458-025-09688-7","DOIUrl":"10.1007/s10458-025-09688-7","url":null,"abstract":"<div><p>In cooperative human decision-making, agreements are often not total; a partial degree of agreement is sufficient to commit to a decision and move on, as long as one is somewhat confident that the involved parties are likely to stand by their commitment in the future, given no drastic unexpected changes. In this paper, we introduce the notion of <i>agreement scenarios</i> that allow artificial autonomous agents to reach such agreements, using formal models of argumentation, in particular abstract argumentation and value-based argumentation. We introduce the notions of degrees of satisfaction and (minimum, mean, and median) agreement, as well as a measure of the impact a value in a value-based argumentation framework has on these notions. We then analyze how degrees of agreement are affected when agreement scenarios are expanded with new information, to shed light on the reliability of partial agreements in dynamic scenarios. An implementation of the introduced concepts is provided as part of an argumentation-based reasoning software library.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"39 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10458-025-09688-7.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143109687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Reflexive anticipatory reasoning by BDI agents
Jomi Fred Hübner, Samuele Burattini, Alessandro Ricci, Simon Mayer
Autonomous Agents and Multi-Agent Systems 39(1). Published 2025-01-23. DOI: 10.1007/s10458-025-09687-8

Abstract: This paper investigates how predictions about the future behaviour of an agent can be exploited to improve its decision-making in the present. Future states are foreseen by a simulation technique based on models of both the environment and the agent. Although the environment model is usually taken into account for prediction in artificial intelligence (e.g., in automated planning), the agent model receives less attention. We leverage the agent model to speed up the simulation and as a source of alternative decisions. Our proposal bases the agent model on the practical knowledge the developer has given to the agent, which is especially natural for BDI agents. This knowledge is then exploited in the proposed future-concerned reasoning mechanisms. We present a prototype implementation of our approach, along with the results of its evaluation in static and dynamic environments. This allows us to better understand the relation between the improvement in the agent's decisions and the quality of the knowledge provided by the developer.

Budget-feasible egalitarian allocation of conflicting jobs
Sushmita Gupta, Pallavi Jain, A. Mohanapriya, Vikash Tripathi
Autonomous Agents and Multi-Agent Systems 39(1). Published 2025-01-11. DOI: 10.1007/s10458-024-09686-1. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10458-024-09686-1.pdf

Abstract: Allocating conflicting jobs among individuals while respecting each individual's budget constraint is an optimization problem that arises in various real-world scenarios. In this paper, we consider the situation where each individual derives some satisfaction from each job. We focus on finding a feasible allocation of conflicting jobs that maximizes the egalitarian cost, i.e., the satisfaction of the worst-off individual. To the best of our knowledge, this is the first paper to combine egalitarianism, budget feasibility, and conflict-freeness in allocations. We provide a systematic study of the computational complexity of finding a budget-feasible, conflict-free egalitarian allocation and show that our problem generalizes a large number of classical optimization problems. Unsurprisingly, it is therefore NP-hard even for two individuals and even when no jobs conflict. We show that the problem nonetheless admits approximation algorithms and parameterized algorithms with respect to a host of natural parameters, matching and in some cases improving upon the running times of known algorithms.

{"title":"La VIDA: towards a motivated goal reasoning agent","authors":"Ursula Addison","doi":"10.1007/s10458-024-09685-2","DOIUrl":"10.1007/s10458-024-09685-2","url":null,"abstract":"<div><p>An autonomous agent deployed to operate over extended horizons in uncertain environments will encounter situations for which it was not designed. A class of these situations involves an invalidation of agent goals and limited guidance in establishing a new set of goals to pursue. An agent will benefit from some mechanism that will allow it to pursue new goals under these circumstances such that the goals are broadly useful in its environment and take advantage of its existing skills while aligning with societal norms. We propose augmenting a goal reasoning agent, i.e., an agent that can deliberate on and self-select its goals, with a motivation system that can be used to both constrain and motivate agent behavior. A human-like motivation system coupled with a goal-self concordant selection technique allows the approach to be framed as an optimization problem in which the agent selects goals that have high utility while simultaneously in harmony with its motivations. Over the agent’s operational lifespan its motivation system adjusts incrementally to more closely reflect the reality of its goal reasoning and goal pursuit experiences. Experiments performed with an ablation testing technique comparing the average utility of goals achieved in the presence and absence of a motivation system suggest that the motivated version of the system leads to pursuing more useful goals than the baseline.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"39 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142912946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Aggregating bipolar opinions through bipolar assumption-based argumentation
Charles Dickie, Stefan Lauren, Francesco Belardinelli, Antonio Rago, Francesca Toni
Autonomous Agents and Multi-Agent Systems 39(1). Published 2024-11-25. DOI: 10.1007/s10458-024-09684-3. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10458-024-09684-3.pdf

Abstract: We introduce a novel method for aggregating bipolar argumentation frameworks that express the opinions of different parties in a debate. We use Bipolar Assumption-based Argumentation (ABA) as an all-encompassing formalism for bipolar argumentation under different semantics. Leveraging recent results on judgement aggregation in social choice theory, we prove several preservation results for relevant properties of bipolar ABA under quota and oligarchic rules. Specifically, we prove (positive and negative) results about the preservation of conflict-free, closed, admissible, preferred, complete, set-stable, well-founded, and ideal extensions in bipolar ABA, as well as the preservation of acceptability, acyclicity, and coherence for individual assumptions. Finally, we illustrate our methodology and results with a case study on opinion aggregation for the treatment of long COVID patients.

{"title":"Information gathering in POMDPs using active inference","authors":"Erwin Walraven, Joris Sijs, Gertjan J. Burghouts","doi":"10.1007/s10458-024-09683-4","DOIUrl":"10.1007/s10458-024-09683-4","url":null,"abstract":"<div><p>Gathering information about the environment state is the main goal in several planning tasks for autonomous agents, such as surveillance, inspection and tracking of objects. Such planning tasks are typically modeled using a Partially Observable Markov Decision Process (POMDP), and in the literature several approaches have emerged to consider information gathering during planning and execution. Similar developments can be seen in the field of active inference, which focuses on active information collection in order to be able to reach a goal. Both fields use POMDPs to model the environment, but the underlying principles for action selection are different. In this paper we create a bridge between both research fields by discussing how they relate to each other and how they can be used for information gathering. Our contribution is a tailored approach to model information gathering tasks directly in the active inference framework. A series of experiments demonstrates that our approach enables agents to gather information about the environment state. As a result, active inference becomes an alternative to common POMDP approaches for information gathering, which opens the door towards more cross cutting research at the intersection of both fields. This is advantageous, because recent advancements in POMDP solvers may be used to accelerate active inference, and the principled active inference framework may be used to model POMDP agents that operate in a neurobiologically plausible fashion.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"39 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142595293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}