Sándor P. Fekete, Phillip Keldenich, Ramin Kosfeld, Christian Rieck, Christian Scheffer
{"title":"Connected coordinated motion planning with bounded stretch","authors":"Sándor P. Fekete, Phillip Keldenich, Ramin Kosfeld, Christian Rieck, Christian Scheffer","doi":"10.1007/s10458-023-09626-5","DOIUrl":"10.1007/s10458-023-09626-5","url":null,"abstract":"<div><p>We consider the problem of connected coordinated motion planning for a large collective of simple, identical robots: From a given start grid configuration of robots, we need to reach a desired target configuration via a sequence of parallel, collision-free robot motions, such that the set of robots induces a connected grid graph at all integer times. The objective is to minimize the <i>makespan</i> of the motion schedule, i.e., to reach the new configuration in a minimum amount of time. We show that this problem is <span>NP</span>-complete, even for deciding whether a makespan of 2 can be achieved, while it is possible to check in polynomial time whether a makespan of 1 can be achieved. On the algorithmic side, we establish simultaneous constant-factor approximation for two fundamental parameters, by achieving <i>constant stretch</i> for <i>constant scale</i>. Scaled shapes (which arise by increasing all dimensions of a given object by the same multiplicative factor) have been considered in previous seminal work on self-assembly, often with unbounded or logarithmic scale factors; we provide methods for a generalized scale factor, bounded by a constant. 
Moreover, our algorithm achieves a <i>constant stretch factor</i>: If mapping the start configuration to the target configuration requires a maximum Manhattan distance of <i>d</i>, then the total duration of our overall schedule is <span>\\(\\mathcal{O}(d)\\)</span>, which is optimal up to constant factors.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"37 2","pages":""},"PeriodicalIF":1.9,"publicationDate":"2023-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10458-023-09626-5.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50491884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jianglin Qiao, Dave de Jonge, Dongmo Zhang, Simeon Simoff, Carles Sierra, Bo Du
{"title":"Price of anarchy of traffic assignment with exponential cost functions","authors":"Jianglin Qiao, Dave de Jonge, Dongmo Zhang, Simeon Simoff, Carles Sierra, Bo Du","doi":"10.1007/s10458-023-09625-6","DOIUrl":"10.1007/s10458-023-09625-6","url":null,"abstract":"<div><p>The rapid evolution of technology in connected automated and autonomous vehicles offers immense potential for revolutionizing future intelligent traffic control and management. This potential is exemplified by the diverse range of control paradigms, ranging from self-routing to centralized control. However, selecting among these paradigms goes beyond technical considerations: it requires a delicate balance between autonomous decision-making and holistic system optimization. A pivotal quantitative parameter in navigating this balance is the concept of the “price of anarchy” (PoA) inherent in autonomous decision frameworks. This paper analyses the price of anarchy for road networks with CAV traffic. We model a traffic network as a routing game in which vehicles are selfish agents who choose routes to travel autonomously to minimize travel delays caused by road congestion. Unlike existing research in which the latency function of road congestion was based on polynomial functions like the well-known BPR function, we focus on routing games where an exponential function can specify the latency of road traffic. We first calculate a tight upper bound for the price of anarchy for this class of games and then compare this result with the tight upper bound of the PoA for routing games with the BPR latency function. The comparison shows that as long as the traffic volume is lower than the road capacity, the tight upper bound of the PoA of the games with the exponential function is lower than the corresponding value with the BPR function. 
Finally, numerical results based on real-world traffic data demonstrate that the exponential function can approximate road latency as closely as the BPR function with even tighter exponential parameters, which results in a relatively lower upper bound.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"37 2","pages":""},"PeriodicalIF":1.9,"publicationDate":"2023-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50447098","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cleber J. Amaral, Jomi F. Hübner, Stephen Cranefield
{"title":"Generating and choosing organisations for multi-agent systems","authors":"Cleber J. Amaral, Jomi F. Hübner, Stephen Cranefield","doi":"10.1007/s10458-023-09623-8","DOIUrl":"10.1007/s10458-023-09623-8","url":null,"abstract":"<div><p>The design of organisations is a complex and laborious task. It is the subject of recent studies, which define models to automatically perform this task. However, existing models constrain the space of possible solutions by requiring a priori definitions of organisational roles and usually are not suitable for planning resource use. This paper presents GoOrg, a model that uses as input a set of goals and a set of available agents to generate different arrangements of organisational structures made up of synthesised organisational positions. The most distinguishing characteristics of GoOrg are the use of organisational positions instead of roles and that positions are automatically synthesised rather than required as a priori defined inputs. These characteristics facilitate the parametrisation, the use for resource planning and the chance of finding feasible solutions. This paper also introduces two model extensions, which define processes and constraints that illustrate how GoOrg suits different domains. 
Among the aspects surrounding organisation design, this paper discusses model inputs, agent abstractions and resource planning.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"37 2","pages":""},"PeriodicalIF":1.9,"publicationDate":"2023-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10458-023-09623-8.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50511749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Effect of asynchronous execution and imperfect communication on max-sum belief propagation","authors":"Roie Zivan, Ben Rachmut, Omer Perry, William Yeoh","doi":"10.1007/s10458-023-09621-w","DOIUrl":"10.1007/s10458-023-09621-w","url":null,"abstract":"<div><p>Max-sum is a version of belief propagation that was adapted for solving distributed constraint optimization problems. It has been studied theoretically and empirically, extended to versions that improve solution quality and converge rapidly, and is applicable to multiple distributed applications. The algorithm has been presented in both synchronous and asynchronous versions. However, neither the differences in the performance of the two execution versions nor the implications of imperfect communication (i.e., message delay and message loss) on the two versions have been investigated to the best of our knowledge. We contribute to the body of knowledge on Max-sum by: (1) Establishing the theoretical differences between the two execution versions of the algorithm, focusing on the construction of beliefs; (2) Empirically evaluating the differences between the solutions generated by the two versions of the algorithm, with and without message delay or loss; and (3) Establishing both theoretically and empirically the positive effect of damping on reducing the differences between the two versions. Our results indicate that, in contrast to recently published results indicating that message latency has a drastic (positive) effect on the performance of distributed local search algorithms, the effect of imperfect communication on Damped Max-sum (DMS) is minor. The version of Max-sum that includes both damping and splitting of function nodes converges to high quality solutions very fast, even when a large percentage of the messages sent by agents do not arrive at their destinations. 
Moreover, the quality of solutions in the different versions of DMS depends on the number of messages that were received by the agents, regardless of how long they were delayed or whether these messages are only a portion of the total number of messages sent by the agents.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"37 2","pages":""},"PeriodicalIF":1.9,"publicationDate":"2023-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50482745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fairness criteria for allocating indivisible chores: connections and efficiencies","authors":"Ankang Sun, Bo Chen, Xuan Vinh Doan","doi":"10.1007/s10458-023-09618-5","DOIUrl":"10.1007/s10458-023-09618-5","url":null,"abstract":"<div><p>We study several fairness notions in allocating indivisible <i>chores</i> (i.e., items with disutilities) to agents who have additive and submodular cost functions. The fairness criteria we are concerned with are envy-free up to any item, envy-free up to one item, maximin share (MMS), and pairwise maximin share (PMMS), which are proposed as relaxations of envy-freeness in the setting of additive cost functions. For allocations under each fairness criterion, we establish their approximation guarantee for other fairness criteria. Under the additive setting, our results show strong connections between these fairness criteria and, at the same time, reveal intrinsic differences between goods allocation and chores allocation. However, such strong relationships cannot be inherited by the submodular setting, under which PMMS and MMS are no longer relaxations of envy-freeness and, even worse, few non-trivial guarantees exist. We also investigate efficiency loss under these fairness constraints and establish their prices of fairness.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"37 2","pages":""},"PeriodicalIF":1.9,"publicationDate":"2023-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10458-023-09618-5.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46467925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Thiago Freitas dos Santos, Nardine Osman, Marco Schorlemmer
{"title":"A multi-scenario approach to continuously learn and understand norm violations","authors":"Thiago Freitas dos Santos, Nardine Osman, Marco Schorlemmer","doi":"10.1007/s10458-023-09619-4","DOIUrl":"10.1007/s10458-023-09619-4","url":null,"abstract":"<div><p>Using norms to guide and coordinate interactions has gained tremendous attention in the multiagent community. However, new challenges arise as the interest moves towards dynamic socio-technical systems, where human and software agents interact, and interactions are required to adapt to changing human needs. For instance, different agents (human or software) might not have the same understanding of what it means to violate a norm (e.g., what characterizes hate speech), or their understanding of a norm might change over time (e.g., what constitutes an acceptable response time). The challenge is to address these issues by learning to detect norm violations from limited interaction data and to explain the reasons for such violations. To do that, we propose a framework that combines Machine Learning (ML) models and incremental learning techniques. Our proposal is equipped to solve tasks in both tabular and text classification scenarios. Incremental learning is used to continuously update the base ML models as interactions unfold, ensemble learning is used to handle the imbalanced class distribution of the interaction stream, a Pre-trained Language Model (PLM) is used to learn from text sentences, and Integrated Gradients (IG) serves as the interpretability algorithm. We evaluate the proposed approach in the use case of Wikipedia article edits, where interactions revolve around editing articles, and the norm in question is prohibiting vandalism. 
Results show that the proposed framework can learn to detect norm violations in a setting with data imbalance and concept drift.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"37 2","pages":""},"PeriodicalIF":1.9,"publicationDate":"2023-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10458-023-09619-4.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41757932","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Katarína Cechlárová, Julien Lesca, Diana Trellová, Martina Hančová, Jozef Hanč
{"title":"Hardness of candidate nomination","authors":"Katarína Cechlárová, Julien Lesca, Diana Trellová, Martina Hančová, Jozef Hanč","doi":"10.1007/s10458-023-09622-9","DOIUrl":"10.1007/s10458-023-09622-9","url":null,"abstract":"<div><p>We consider elections where the set of candidates is split into parties and each party can nominate just one candidate. We study the computational complexity of two problems. The <span>Possible President</span> problem asks whether a given party candidate can become the unique winner of the election for some nominations from other parties. The <span>Necessary President</span> problem asks whether a given candidate will be the unique winner of the election for any possible nominations from other parties. We consider several different voting rules and show that for all of them the <span>Possible President</span> problem is NP-complete, even if the size of each party is at most two; for some voting rules we prove that the <span>Necessary President</span> problem is coNP-complete. Further, we formulate integer programs to solve the <span>Possible President</span> and <span>Necessary President</span> problems and test them on real and artificial data.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"37 2","pages":""},"PeriodicalIF":1.9,"publicationDate":"2023-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10458-023-09622-9.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48956407","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Classifying ambiguous identities in hidden-role Stochastic games with multi-agent reinforcement learning","authors":"Shijie Han, Siyuan Li, Bo An, Wei Zhao, Peng Liu","doi":"10.1007/s10458-023-09620-x","DOIUrl":"10.1007/s10458-023-09620-x","url":null,"abstract":"<div><p>Multi-agent reinforcement learning (MARL) is a prevalent learning paradigm for solving stochastic games. In most MARL studies, agents in a game are defined as teammates or enemies beforehand, and the relationships among the agents (i.e., their <i>identities</i>) remain fixed throughout the game. However, in real-world problems, the agent relationships are commonly unknown in advance or dynamically changing. Many multi-party interactions start off by asking: who is on my team? This question arises whether it is the first day at the stock exchange or the kindergarten. Therefore, training policies for such situations in the face of imperfect information and ambiguous <i>identities</i> is an important problem that needs to be addressed. In this work, we develop a novel identity detection reinforcement learning (IDRL) framework that allows an agent to dynamically infer the identities of nearby agents and select an appropriate policy to accomplish the task. In the IDRL framework, a relation network is constructed to deduce the identities of other agents by observing the behaviors of the agents. A danger network is optimized to estimate the risk of false-positive identifications. Beyond that, we propose an intrinsic reward that balances the need to maximize external rewards against the need for accurate identification. After identifying the cooperation-competition pattern among the agents, IDRL applies one of the off-the-shelf MARL methods to learn the policy. To evaluate the proposed method, we conduct experiments on the <i>Red-10</i> card-shedding game, and the results show that IDRL achieves superior performance over other state-of-the-art MARL methods. 
Impressively, the relation network performs on par with top human players at identifying the identities of agents; the danger network reasonably avoids the risk of imperfect identification. The code to reproduce all the reported results is available online at https://github.com/MR-BENjie/IDRL.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"37 2","pages":""},"PeriodicalIF":1.9,"publicationDate":"2023-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43149265","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nieves Montes, Michael Luck, Nardine Osman, Odinaldo Rodrigues, Carles Sierra
{"title":"Combining theory of mind and abductive reasoning in agent-oriented programming","authors":"Nieves Montes, Michael Luck, Nardine Osman, Odinaldo Rodrigues, Carles Sierra","doi":"10.1007/s10458-023-09613-w","DOIUrl":"10.1007/s10458-023-09613-w","url":null,"abstract":"<div><p>This paper presents a novel model, called T<span>om</span>A<span>bd</span>, that endows autonomous agents with Theory of Mind capabilities. T<span>om</span>A<span>bd</span> agents are able to simulate the perspective of the world that their peers have and reason from their perspective. Furthermore, T<span>om</span>A<span>bd</span> agents can reason from the perspective of others down to an <i>arbitrary level of recursion</i>, using Theory of Mind of <span>\\(n^{\\text{th}}\\)</span> order. By combining the previous capability with abductive reasoning, T<span>om</span>A<span>bd</span> agents can infer the beliefs that others were relying upon to select their actions, hence putting them in a more informed position when it comes to their own decision-making. We have tested the T<span>om</span>A<span>bd</span> model in the challenging domain of Hanabi, a game characterised by cooperation and imperfect information. 
Our results show that the abilities granted by the T<span>om</span>A<span>bd</span> model boost the performance of the team along a variety of metrics, including final score, efficiency of communication, and uncertainty reduction.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"37 2","pages":""},"PeriodicalIF":1.9,"publicationDate":"2023-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10458-023-09613-w.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47381284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tongtong Liu, Joe McCalmon, Thai Le, Md Asifur Rahman, Dongwon Lee, Sarra Alqahtani
{"title":"A novel policy-graph approach with natural language and counterfactual abstractions for explaining reinforcement learning agents","authors":"Tongtong Liu, Joe McCalmon, Thai Le, Md Asifur Rahman, Dongwon Lee, Sarra Alqahtani","doi":"10.1007/s10458-023-09615-8","DOIUrl":"10.1007/s10458-023-09615-8","url":null,"abstract":"<div><p>As reinforcement learning (RL) continues to improve and be applied in situations alongside humans, the need to explain the learned behaviors of RL agents to end-users becomes more important. Strategies for explaining the reasoning behind an agent’s policy, called <i>policy-level explanations</i>, can lead to important insights about both the task and the agent’s behaviors. Following this line of research, in this work, we propose a novel approach, named <span>CAPS</span>, that summarizes an agent’s policy in the form of a directed graph with natural language descriptions. A decision-tree-based clustering method is used to abstract the state space of the task into fewer, condensed states, which makes the policy graphs more digestible to end-users. We then use user-defined predicates to enrich the abstract states with semantic meaning. To introduce counterfactual state explanations to the policy graph, we first identify the critical states in the graph and then develop a novel counterfactual explanation method based on action perturbation in those critical states. We generate explanation graphs using <span>CAPS</span> on 5 RL tasks, using both deterministic and stochastic policies. We also evaluate the effectiveness of CAPS on human participants who are not RL experts in two user studies. 
When provided with our explanation graph, end-users are able to accurately interpret policies of trained RL agents 80% of the time, compared to 10% when provided with the next best baseline. In addition, 68.2% of users demonstrated an increase in their confidence in understanding an agent’s behavior after being provided with the counterfactual explanations.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"37 2","pages":""},"PeriodicalIF":1.9,"publicationDate":"2023-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46086354","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}