Entropy based blending of policies for multi-agent coexistence

IF 2 3区计算机科学 Q3 AUTOMATION & CONTROL SYSTEMS

Autonomous Agents and Multi-Agent Systems Pub Date : 2025-05-16 DOI:10.1007/s10458-025-09707-7

David Rother, Franziska Herbert, Fabian Kalter, Dorothea Koert, Joni Pajarinen, Jan Peters, Thomas H. Weisswange

{"title":"Entropy based blending of policies for multi-agent coexistence","authors":"David Rother, Franziska Herbert, Fabian Kalter, Dorothea Koert, Joni Pajarinen, Jan Peters, Thomas H. Weisswange","doi":"10.1007/s10458-025-09707-7","DOIUrl":null,"url":null,"abstract":"<div><p>Research on multi-agent interaction involving humans is still in its infancy. Most approaches have focused on environments with collaborative human behavior or a small, defined set of situations. When deploying robots in human-inhabited environments in the future, the diversity of interactions surpasses the capabilities of pre-trained collaboration models. ”Coexistence” environments, characterized by agents with varying or partially aligned objectives, present a unique challenge for robotic collaboration. Traditional reinforcement learning methods fall short in these settings. These approaches lack the flexibility to adapt to changing agent counts or task requirements without undergoing retraining. Moreover, existing models do not adequately support scenarios where robots should exhibit helpful behavior toward others without compromising their primary goals. To tackle this issue, we introduce a novel framework that decomposes interaction and task-solving into separate learning problems and blends the resulting policies at inference time using a goal inference model for task estimation. We create impact-aware agents and linearly scale the cost of training agents with the number of agents and available tasks. To this end, a weighting function blending action distributions for individual interactions with the original task action distribution is proposed. To support our claims we demonstrate that our framework scales in task and agent count across several environments and considers collaboration opportunities when present. The new learning paradigm opens the path to more complex multi-robot, multi-human interactions.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"39 1","pages":""},"PeriodicalIF":2.0000,"publicationDate":"2025-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10458-025-09707-7.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Autonomous Agents and Multi-Agent Systems","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10458-025-09707-7","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}

引用次数: 0

Abstract

Research on multi-agent interaction involving humans is still in its infancy. Most approaches have focused on environments with collaborative human behavior or a small, defined set of situations. When deploying robots in human-inhabited environments in the future, the diversity of interactions surpasses the capabilities of pre-trained collaboration models. ”Coexistence” environments, characterized by agents with varying or partially aligned objectives, present a unique challenge for robotic collaboration. Traditional reinforcement learning methods fall short in these settings. These approaches lack the flexibility to adapt to changing agent counts or task requirements without undergoing retraining. Moreover, existing models do not adequately support scenarios where robots should exhibit helpful behavior toward others without compromising their primary goals. To tackle this issue, we introduce a novel framework that decomposes interaction and task-solving into separate learning problems and blends the resulting policies at inference time using a goal inference model for task estimation. We create impact-aware agents and linearly scale the cost of training agents with the number of agents and available tasks. To this end, a weighting function blending action distributions for individual interactions with the original task action distribution is proposed. To support our claims we demonstrate that our framework scales in task and agent count across several environments and considers collaboration opportunities when present. The new learning paradigm opens the path to more complex multi-robot, multi-human interactions.

查看原文本刊更多论文

基于熵的多智能体共存策略混合

涉及人类的多智能体交互研究仍处于起步阶段。大多数方法都集中在具有协作性人类行为的环境或一组小的、已定义的情况。当未来在人类居住的环境中部署机器人时，交互的多样性超过了预先训练的协作模型的能力。”共存环境以具有不同或部分一致目标的代理为特征，为机器人协作提出了独特的挑战。传统的强化学习方法在这些情况下是不够的。这些方法缺乏灵活性，无法在不进行再培训的情况下适应不断变化的代理数量或任务需求。此外，现有的模型并不能充分支持机器人在不损害其主要目标的情况下对他人表现出帮助行为的场景。为了解决这个问题，我们引入了一个新的框架，该框架将交互和任务解决分解为单独的学习问题，并使用任务估计的目标推理模型在推理时混合结果策略。我们创建影响感知代理，并根据代理和可用任务的数量线性缩放培训代理的成本。为此，提出了一种混合个体交互动作分布与原始任务动作分布的加权函数。为了支持我们的说法，我们证明了我们的框架可以在多个环境中扩展任务和代理数量，并在存在时考虑协作机会。新的学习范式为更复杂的多机器人、多人类互动开辟了道路。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Autonomous Agents and Multi-Agent Systems 工程技术-计算机：人工智能

CiteScore

6.00

自引率

5.30%

发文量

审稿时长

>12 weeks

期刊介绍： This is the official journal of the International Foundation for Autonomous Agents and Multi-Agent Systems. It provides a leading forum for disseminating significant original research results in the foundations, theory, development, analysis, and applications of autonomous agents and multi-agent systems. Coverage in Autonomous Agents and Multi-Agent Systems includes, but is not limited to: Agent decision-making architectures and their evaluation, including: cognitive models; knowledge representation; logics for agency; ontological reasoning; planning (single and multi-agent); reasoning (single and multi-agent) Cooperation and teamwork, including: distributed problem solving; human-robot/agent interaction; multi-user/multi-virtual-agent interaction; coalition formation; coordination Agent communication languages, including: their semantics, pragmatics, and implementation; agent communication protocols and conversations; agent commitments; speech act theory Ontologies for agent systems, agents and the semantic web, agents and semantic web services, Grid-based systems, and service-oriented computing Agent societies and societal issues, including: artificial social systems; environments, organizations and institutions; ethical and legal issues; privacy, safety and security; trust, reliability and reputation Agent-based system development, including: agent development techniques, tools and environments; agent programming languages; agent specification or validation languages Agent-based simulation, including: emergent behavior; participatory simulation; simulation techniques, tools and environments; social simulation Agreement technologies, including: argumentation; collective decision making; judgment aggregation and belief merging; negotiation; norms Economic paradigms, including: auction and mechanism design; bargaining and negotiation; economically-motivated agents; game theory (cooperative and non-cooperative); social choice and voting Learning agents, including: computational architectures for learning agents; evolution, adaptation; multi-agent learning. Robotic agents, including: integrated perception, cognition, and action; cognitive robotics; robot planning (including action and motion planning); multi-robot systems. Virtual agents, including: agents in games and virtual environments; companion and coaching agents; modeling personality, emotions; multimodal interaction; verbal and non-verbal expressiveness Significant, novel applications of agent technology Comprehensive reviews and authoritative tutorials of research and practice in agent systems Comprehensive and authoritative reviews of books dealing with agents and multi-agent systems.