{"title":"Truthful interval covering","authors":"Argyrios Deligkas, Aris Filos-Ratsikas, Alexandros A. Voudouris","doi":"10.1007/s10458-024-09673-6","DOIUrl":"10.1007/s10458-024-09673-6","url":null,"abstract":"<div><p>We initiate the study of a novel problem in mechanism design without money, which we term <i>Truthful Interval Covering</i> (TIC). An instance of TIC consists of a set of agents, each associated with an individual interval on a line, and the objective is to decide where to place a <i>covering interval</i> to minimize the total social or egalitarian cost of the agents, which is determined by the intersection of this interval with their individual ones. This fundamental problem can model situations of provisioning a public good, such as the use of power generators to prevent or mitigate load shedding in developing countries. In the strategic version of the problem, the agents wish to minimize their individual costs and might misreport the position and/or length of their intervals to achieve that. Our goal is to design <i>truthful</i> mechanisms to prevent such strategic misreports and achieve good approximations of the best possible social or egalitarian cost. We consider the fundamental setting of known intervals with equal lengths and provide tight bounds on the approximation ratios achieved by truthful deterministic mechanisms. For the social cost, we also design a randomized truthful mechanism that outperforms all possible deterministic ones. Finally, we highlight a plethora of natural extensions of our model for future work, as well as some natural limitations of those settings.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"38 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10458-024-09673-6.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142191088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
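The TIC objective described in the abstract above can be illustrated with a minimal sketch. It assumes a concrete cost model (each agent's cost is the uncovered length of its own interval), which is one natural reading of "determined by the intersection" but not necessarily the paper's exact definition; the function names are illustrative, not from the paper.

```python
def overlap(a, b, c, d):
    """Length of the intersection of intervals [a, b] and [c, d]."""
    return max(0.0, min(b, d) - max(a, c))

def social_cost(x, L, agents, ell):
    """Total cost when the covering interval is [x, x + L]; each agent's
    cost is taken here to be the uncovered length of its own interval
    [a_i, a_i + ell] (a modelling assumption for illustration)."""
    return sum(ell - overlap(a, a + ell, x, x + L) for a in agents)

def best_position(agents, ell, L):
    """Brute-force search over endpoint alignments, where the
    piecewise-linear social cost can attain its minimum."""
    candidates = set()
    for a in agents:
        candidates.update({a, a + ell, a - L, a + ell - L})
    return min(candidates, key=lambda x: social_cost(x, L, agents, ell))
```

Note this non-strategic optimum is exactly what a truthful mechanism can only approximate once agents may misreport their intervals.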
{"title":"One-sided matching markets with endowments: equilibria and algorithms","authors":"Jugal Garg, Thorben Tröbst, Vijay Vazirani","doi":"10.1007/s10458-024-09670-9","DOIUrl":"10.1007/s10458-024-09670-9","url":null,"abstract":"<div><p>The Arrow–Debreu extension of the classic Hylland–Zeckhauser scheme (Hylland and Zeckhauser in J Polit Econ 87(2):293–314, 1979) for a one-sided matching market—called ADHZ in this paper—has natural applications but has instances which do not admit equilibria. By introducing approximation, we define the <span>ε</span><i>-approximate ADHZ model</i> and give the following results. 1. Existence of equilibrium under linear utility functions. We prove that the equilibrium allocation satisfies Pareto optimality, approximate envy-freeness, and approximate weak core stability. 2. A combinatorial polynomial-time algorithm for an <span>ε</span>-approximate ADHZ equilibrium for the case of dichotomous, and more generally bi-valued, utilities. 3. An instance of ADHZ, with dichotomous utilities and a strongly connected demand graph, which does not admit an equilibrium. 4. A rational convex program for HZ under dichotomous utilities; a combinatorial polynomial-time algorithm for this case was given in Vazirani and Yannakakis (in: Innovations in theoretical computer science, pp 59:1–59:19, 2021). The <span>ε</span>-approximate ADHZ model fills a void in the space of general mechanisms for one-sided matching markets; see details in the paper.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"38 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141948434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficiently reconfiguring a connected swarm of labeled robots","authors":"Sándor P. Fekete, Peter Kramer, Christian Rieck, Christian Scheffer, Arne Schmidt","doi":"10.1007/s10458-024-09668-3","DOIUrl":"10.1007/s10458-024-09668-3","url":null,"abstract":"<div><p>When considering motion planning for a swarm of <i>n</i> labeled robots, we need to rearrange a given start configuration into a desired target configuration via a sequence of parallel, collision-free moves. The objective is to reach the new configuration in a minimum amount of time. Problems of this type have been considered before, with recent notable results achieving <i>constant stretch</i> for parallel reconfiguration: If mapping the start configuration to the target configuration requires a maximum Manhattan distance of <i>d</i>, the total duration of an overall schedule can be bounded to <span>𝒪(d)</span>, which is optimal up to constant factors. An important constraint for coordinated reconfiguration is to keep the swarm connected after each time step. In previous work, constant stretch could only be achieved if <i>disconnected</i> reconfiguration is allowed, or for scaled configurations of <i>unlabeled</i> robots; on the other hand, the existence of non-constant lower bounds on the stretch factor was unknown. We resolve these major open problems by (1) establishing a lower bound of <span>Ω(√n)</span> for connected, labeled reconfiguration and, most importantly, by (2) proving that for scaled arrangements, constant stretch for connected, labeled reconfiguration can be achieved. In addition, we show that (3) it is <span>NP</span>-complete to decide whether a makespan of 2 can be achieved, while it is possible to check in polynomial time whether a schedule of makespan 1 exists.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"38 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10458-024-09668-3.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141948435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
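The quantity <i>d</i> from the abstract above, which lower-bounds any schedule's makespan and defines what "constant stretch" is measured against, is simple to compute; a minimal sketch (function name ours, configurations given as matched lists of grid coordinates):

```python
def manhattan_diameter(start, target):
    """d = the maximum Manhattan distance any labeled robot must travel
    from its start to its target cell. Any schedule needs makespan >= d;
    constant stretch means a schedule of length O(d)."""
    return max(abs(sx - tx) + abs(sy - ty)
               for (sx, sy), (tx, ty) in zip(start, target))
```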
{"title":"Carbon trading supply chain management based on constrained deep reinforcement learning","authors":"Qinghao Wang, Yaodong Yang","doi":"10.1007/s10458-024-09669-2","DOIUrl":"10.1007/s10458-024-09669-2","url":null,"abstract":"<div><p>The issue of carbon emissions is a critical global concern, and how to effectively reduce energy consumption and emissions is a challenge faced by the industrial sector, which is highly emphasized in supply chain management. The complexity arises from the intricate coupling mechanism between carbon trading and ordering. The large-scale state space involved and various constraints make cost optimization difficult. Carbon quota constraints and sequential decision-making exacerbate the challenges for businesses. Existing research implements rule-based and heuristic numerical simulation, which struggles to adapt to time-varying environments. We develop a unified framework from the perspective of Constrained Markov Decision Processes (CMDP). Constrained Deep Reinforcement Learning (DRL), with its powerful high-dimensional neural-network representations and effective decision-making capabilities under constraints, provides a potential solution for supply chain management that includes carbon trading. DRL with constraints is a crucial tool to study cost optimization for enterprises. This paper constructs a DRL algorithm for Double Order based on PPO-Lagrangian (DOPPOL), aimed at addressing a supply chain management model that integrates carbon trading decisions and ordering decisions. The results indicate that businesses can optimize both business and carbon costs, thereby increasing overall profits, as well as adapt to various demand uncertainties. DOPPOL outperforms the traditional (<i>s</i>, <i>S</i>) method in fluctuating demand scenarios. By introducing carbon trading, enterprises are able to adjust supply chain orders and carbon emissions through interaction, and improve operational efficiency. Finally, we emphasize the significant role of carbon pricing in enterprise contracts in terms of profitability, as reasonable prices can help control carbon emissions and reduce costs. Our research is of great importance for climate change control, as well as for promoting sustainability.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"38 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141948436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
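The classic (s, S) baseline that DOPPOL is compared against in the abstract above is a fixed ordering rule. A minimal sketch (function names and the toy demand loop are ours; this is the textbook policy, not the paper's experimental setup):

```python
def order_quantity(inventory, s, S):
    """Classic (s, S) policy: when inventory drops to the reorder
    point s or below, order up to the target level S; otherwise
    order nothing."""
    return S - inventory if inventory <= s else 0

def simulate(demands, s, S, start):
    """Apply the policy over a demand sequence; returns the inventory
    level after each period (replenish first, then serve demand)."""
    inv, trace = start, []
    for d in demands:
        inv += order_quantity(inv, s, S)
        inv -= d
        trace.append(inv)
    return trace
```

Being fixed, such a rule cannot react to carbon prices or shifting demand, which is the gap a constrained-DRL policy is meant to fill.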
{"title":"Envy-freeness in 3D hedonic games","authors":"Michael McKay, Ágnes Cseh, David Manlove","doi":"10.1007/s10458-024-09657-6","DOIUrl":"10.1007/s10458-024-09657-6","url":null,"abstract":"<div><p>We study the problem of fairly partitioning a set of agents into coalitions based on the agents’ additively separable preferences, which can also be viewed as a hedonic game. We study three successively weaker solution concepts, related to envy, weakly justified envy, and justified envy. In a model in which coalitions may have any size, trivial solutions exist for these concepts, which provides a strong motivation for placing restrictions on coalition size. In this paper, we require feasible coalitions to have size three. We study the existence of partitions that are envy-free, weakly justified envy-free, and justified envy-free, and the computational complexity of finding such partitions, if they exist. We impose various restrictions on the agents’ preferences and present a complete complexity classification in terms of these restrictions.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"38 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10458-024-09657-6.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141863957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
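The strongest of the three solution concepts in the abstract above, envy-freeness with additively separable preferences, is easy to verify for a given partition into triples. A sketch under one standard reading of envy (agent i envies agent j if i would be strictly better off taking j's place); the paper's weaker justified-envy notions add further conditions, and the function names are ours:

```python
def utility(agent, coalition, v):
    """Additively separable utility: the sum of the agent's values
    for its coalition partners."""
    return sum(v[agent][other] for other in coalition if other != agent)

def is_envy_free(partition, v):
    """Check that no agent in one coalition of three envies an agent
    in another, i.e. would strictly prefer joining that agent's two
    partners in its stead."""
    where = {a: c for c in partition for a in c}
    for i in where:
        for j in where:
            if i == j or j in where[i]:
                continue
            partners = [x for x in where[j] if x != j]
            if sum(v[i][x] for x in partners) > utility(i, where[i], v):
                return False
    return True
```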
{"title":"Team-wise effective communication in multi-agent reinforcement learning","authors":"Ming Yang, Kaiyan Zhao, Yiming Wang, Renzhi Dong, Yali Du, Furui Liu, Mingliang Zhou, Leong Hou U","doi":"10.1007/s10458-024-09665-6","DOIUrl":"10.1007/s10458-024-09665-6","url":null,"abstract":"<div><p>Effective communication is crucial for the success of multi-agent systems, as it promotes collaboration for attaining joint objectives and enhances competitive efforts towards individual goals. In the context of multi-agent reinforcement learning, determining “whom”, “how” and “what” to communicate are crucial factors for developing effective policies. Therefore, we propose TeamComm, a novel framework for multi-agent communication reinforcement learning. First, it introduces a dynamic team reasoning policy, allowing agents to dynamically form teams and adapt their communication partners based on task requirements and environment states in cooperative or competitive scenarios. Second, TeamComm utilizes heterogeneous communication channels, both intra-team and inter-team, to achieve diverse information flow. Lastly, TeamComm leverages the information bottleneck principle to optimize communication content, guiding agents to convey relevant and valuable information. Through experimental evaluations on three popular environments with seven different scenarios, we empirically demonstrate the superior performance of TeamComm compared to existing methods.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"38 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141741489","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"When is it acceptable to break the rules? Knowledge representation of moral judgements based on empirical data","authors":"Edmond Awad, Sydney Levine, Andrea Loreggia, Nicholas Mattei, Iyad Rahwan, Francesca Rossi, Kartik Talamadupula, Joshua Tenenbaum, Max Kleiman-Weiner","doi":"10.1007/s10458-024-09667-4","DOIUrl":"10.1007/s10458-024-09667-4","url":null,"abstract":"<div><p>Constraining the actions of AI systems is one promising way to ensure that these systems behave in a way that is morally acceptable to humans. But constraints alone come with drawbacks, as in many AI systems they are not flexible. If these constraints are too rigid, they can preclude actions that are actually acceptable in certain contextual situations. Humans, on the other hand, can often decide when a simple and seemingly inflexible rule should actually be overridden based on the context. In this paper, we empirically investigate the way humans make these contextual moral judgements, with the goal of building AI systems that understand when to follow and when to override constraints. We propose a novel and general preference-based graphical model that captures a modification of standard <i>dual process</i> theories of moral judgment. We then detail the design, implementation, and results of a study of human participants who judge whether it is acceptable to break a well-established rule: <i>no cutting in line</i>. We then develop an instance of our model and compare its performance to that of standard machine learning approaches on the task of predicting the behavior of human participants in the study, showing that our preference-based approach more accurately captures the judgments of human decision-makers. It also provides a flexible method to model the relationship between variables for moral decision-making tasks that can be generalized to other settings.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"38 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10458-024-09667-4.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141612786","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Emergent cooperation from mutual acknowledgment exchange in multi-agent reinforcement learning","authors":"Thomy Phan, Felix Sommer, Fabian Ritz, Philipp Altmann, Jonas Nüßlein, Michael Kölle, Lenz Belzner, Claudia Linnhoff-Popien","doi":"10.1007/s10458-024-09666-5","DOIUrl":"10.1007/s10458-024-09666-5","url":null,"abstract":"<div><p><i>Peer incentivization (PI)</i> is a recent approach where all agents learn to reward or penalize each other in a distributed fashion, which often leads to emergent cooperation. Current PI mechanisms implicitly assume a flawless communication channel in order to exchange rewards. These rewards are directly incorporated into the learning process without any chance to respond with feedback. Furthermore, most PI approaches rely on global information, which limits scalability and applicability to real-world scenarios where only local information is accessible. In this paper, we propose <i>Mutual Acknowledgment Token Exchange (MATE)</i>, a PI approach defined by a two-phase communication protocol to exchange acknowledgment tokens as incentives to shape individual rewards mutually. All agents condition their token transmissions on the locally estimated quality of their own situations based on environmental rewards and received tokens. MATE is completely decentralized and only requires local communication and information. We evaluate MATE in three social dilemma domains. Our results show that MATE is able to achieve and maintain significantly higher levels of cooperation than previous PI approaches. In addition, we evaluate the robustness of MATE in more realistic scenarios, where agents can deviate from the protocol and communication failures can occur. We also evaluate the sensitivity of MATE w.r.t. the choice of token values.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"38 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10458-024-09666-5.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141588186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
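The two-phase protocol sketched in the abstract above can be caricatured for two agents. This is a loose toy sketch of the protocol's spirit only, not the paper's actual update rule: the zero baseline, the function name, and the symmetric accept condition are our simplifying assumptions.

```python
def mate_step(rewards, token_value=1.0):
    """One toy round of mutual acknowledgment exchange between two
    agents. Phase 1: each agent requests acknowledgment iff its own
    environmental reward meets a local baseline (0 here, a simplifying
    assumption). Phase 2: the receiver responds under the same local
    condition; a completed exchange shapes the sender's reward by
    token_value. No global information is used."""
    requests = [r >= 0 for r in rewards]
    shaped = list(rewards)
    for i, j in [(0, 1), (1, 0)]:
        if requests[i] and requests[j]:
            shaped[i] += token_value
    return shaped
```

The point of the two phases is that, unlike direct reward exchange, a token only takes effect once the peer responds, which is what makes the protocol robust to agents that deviate or messages that are lost.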
{"title":"An agent-based persuasion model using emotion-driven concession and multi-objective optimization","authors":"Zhenwu Wang, Jiayin Shen, Xiaosong Tang, Mengjie Han, Zhenhua Feng, Jinghua Wu","doi":"10.1007/s10458-024-09664-7","DOIUrl":"10.1007/s10458-024-09664-7","url":null,"abstract":"<div><p>Multi-attribute negotiation is essentially a multi-objective optimization (MOO) problem, where models of agent-based emotional persuasion (EP) can exhibit characteristics of anthropomorphism. This paper proposes a novel EP model by fusing the strategy of emotion-driven concession with the method of multi-objective optimization (EDC-MOO). Firstly, a comprehensive emotion model is designed to enhance the authenticity of the emotion. A novel concession strategy is then proposed to enable the concession to be dynamically tuned by the emotions of the agents. Finally, a new EP model is constructed by integrating emotion, historical transaction, persuasion behavior, and concession strategy under the framework of MOO. Comprehensive experiments on bilateral negotiation are conducted to illustrate and validate the effectiveness of EDC-MOO. These include an analysis of negotiations under five distinct persuasion styles, a comparison of EDC-MOO with a non-emotion-based MOO negotiation model and classic trade-off strategies, negotiations between emotion-driven and non-emotion-driven agents, and negotiations involving human participants. A detailed analysis of parameter sensitivity is also discussed. Experimental results show that the proposed EDC-MOO model can enhance the diversity of the negotiation process and the anthropomorphism of the bilateral agents, thereby improving the social welfare of both parties.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"38 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10458-024-09664-7.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141571666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"From large language models to small logic programs: building global explanations from disagreeing local post-hoc explainers","authors":"Andrea Agiollo, Luciano Cavalcante Siebert, Pradeep K. Murukannaiah, Andrea Omicini","doi":"10.1007/s10458-024-09663-8","DOIUrl":"10.1007/s10458-024-09663-8","url":null,"abstract":"<div><p>The expressive power and effectiveness of <i>large language models</i> (LLMs) are going to increasingly push intelligent agents towards sub-symbolic models for natural language processing (NLP) tasks in human–agent interaction. However, LLMs are characterised by a performance vs. transparency trade-off that hinders their applicability to such sensitive scenarios. This is the main reason behind many approaches focusing on <i>local</i> post-hoc explanations, recently proposed by the XAI community in the NLP realm. However, to the best of our knowledge, a thorough comparison among available explainability techniques is currently missing, as are approaches for constructing <i>global</i> post-hoc explanations leveraging the local information. This is why we propose a novel framework for comparing state-of-the-art local post-hoc explanation mechanisms and for extracting logic programs surrogating LLMs. Our experiments—over a wide variety of text classification tasks—show how most local post-hoc explainers are loosely correlated, highlighting substantial discrepancies in their results. By relying on the proposed novel framework, we also show how it is possible to extract faithful and efficient global explanations for the original LLM over multiple tasks, enabling explainable and resource-friendly AI techniques.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"38 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10458-024-09663-8.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141571665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
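One common way to quantify how "loosely correlated" two local explainers are, as the abstract above reports, is a rank correlation between their per-feature attribution vectors. A minimal dependency-free sketch (the paper does not specify its correlation measure, so Spearman here is our illustrative choice; ties are not averaged, for brevity):

```python
def ranks(xs):
    """1-based ranks of the values (no tie averaging, for brevity)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0] * len(xs)
    for pos, i in enumerate(order, start=1):
        r[i] = pos
    return r

def spearman(a, b):
    """Spearman rank correlation between two attribution vectors:
    1 means the two explainers rank features identically, -1 means
    they rank them in exactly opposite order."""
    n = len(a)
    ra, rb = ranks(a), ranks(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1 - 6 * d2 / (n * (n * n - 1))
```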