Double mixing networks based monotonic value function decomposition algorithm for swarm intelligence in UAVs

IF 2.6, Zone 3 (Computer Science), Q3 AUTOMATION & CONTROL SYSTEMS

Autonomous Agents and Multi-Agent Systems, Pub Date: 2025-03-05, DOI: 10.1007/s10458-025-09700-0

Pingping Qu, Chenglong He, Xiaotong Wu, Ershen Wang, Song Xu, Huan Liu, Xinhui Sun


Citations: 0

Abstract

In multi-agent systems, particularly under partial observability, reinforcement learning demonstrates significant autonomous decision-making capabilities. To address resource allocation and collaboration in drone swarms operating in dynamic and unknown environments, we propose a novel deep reinforcement learning algorithm, DQMIX. We adopt the centralized training with decentralized execution framework and use a partially observable Markov game model to describe the complex game environment of drone swarms. The core innovation of DQMIX lies in its dual mixing network structure and soft-switching mechanism. Two independent mixing networks process the agents' local Q-values and synthesize them into a global Q-value. This structure enhances decision accuracy and system adaptability across different scenarios and data conditions. The soft-switching module lets the system transition smoothly between the two networks, selecting the output of the network with the smaller TD error to improve decision stability and coherence. In addition, we introduce Hindsight Experience Replay to learn from failed experiences. Experimental results in JSBSim demonstrate that DQMIX provides an effective solution for drone swarm game problems, especially in resource allocation and adversarial environments.
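
The abstract does not specify the mixer architecture or the exact form of the soft switch, so the following is only a minimal sketch under stated assumptions: each mixer is a QMIX-style monotonic combination whose non-negative weights come from a hypothetical linear hypernetwork, and the soft switch is assumed to be a softmax over negative absolute TD errors. The names `monotonic_mix`, `soft_switch`, `w_hyper`, `b_hyper`, and `temperature` are illustrative and do not come from the paper.

```python
import numpy as np


def monotonic_mix(agent_qs, state, w_hyper, b_hyper):
    """Combine per-agent Q-values into one global Q-value.

    Assumption: a single linear hypernetwork maps the global state to
    mixing weights. Taking the absolute value keeps every weight
    non-negative, so the global value never decreases when any single
    agent's Q-value increases (the monotonicity constraint).
    """
    weights = np.abs(state @ w_hyper)  # shape (n_agents,), all >= 0
    bias = state @ b_hyper             # scalar, state-dependent offset
    return float(agent_qs @ weights + bias)


def soft_switch(q_a, q_b, td_a, td_b, temperature=1.0):
    """Blend two global Q estimates, favouring the smaller |TD error|.

    Assumption: softmax over negative absolute TD errors. As the
    temperature approaches zero this tends to a hard selection of the
    mixer with the smaller error.
    """
    logits = -np.array([abs(td_a), abs(td_b)]) / temperature
    logits -= logits.max()             # shift for numerical stability
    weights = np.exp(logits) / np.exp(logits).sum()
    return float(weights[0] * q_a + weights[1] * q_b), weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_agents, state_dim = 3, 4
    state = rng.normal(size=state_dim)
    agent_qs = rng.normal(size=n_agents)

    # Two independently parameterised mixers (random here, learned in practice).
    mixer_a = (rng.normal(size=(state_dim, n_agents)), rng.normal(size=state_dim))
    mixer_b = (rng.normal(size=(state_dim, n_agents)), rng.normal(size=state_dim))

    q_a = monotonic_mix(agent_qs, state, *mixer_a)
    q_b = monotonic_mix(agent_qs, state, *mixer_b)

    # Placeholder TD errors; in training these would come from the TD target.
    q_tot, blend = soft_switch(q_a, q_b, td_a=0.1, td_b=2.0)
    print(q_a, q_b, q_tot, blend)
```

With a small `temperature` the blend collapses onto the mixer with the smaller TD error, which matches the abstract's description of "selecting the output of the network with smaller TD-errors"; a larger value gives a smoother transition between the two mixers.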




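Source journal

Autonomous Agents and Multi-Agent Systems (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore

6.00

Self-citation rate

5.30%

Articles published

Review time

>12 weeks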

Journal description: This is the official journal of the International Foundation for Autonomous Agents and Multi-Agent Systems. It provides a leading forum for disseminating significant original research results in the foundations, theory, development, analysis, and applications of autonomous agents and multi-agent systems. Coverage in Autonomous Agents and Multi-Agent Systems includes, but is not limited to:

- Agent decision-making architectures and their evaluation, including: cognitive models; knowledge representation; logics for agency; ontological reasoning; planning (single and multi-agent); reasoning (single and multi-agent)
- Cooperation and teamwork, including: distributed problem solving; human-robot/agent interaction; multi-user/multi-virtual-agent interaction; coalition formation; coordination
- Agent communication languages, including: their semantics, pragmatics, and implementation; agent communication protocols and conversations; agent commitments; speech act theory
- Ontologies for agent systems, agents and the semantic web, agents and semantic web services, Grid-based systems, and service-oriented computing
- Agent societies and societal issues, including: artificial social systems; environments, organizations and institutions; ethical and legal issues; privacy, safety and security; trust, reliability and reputation
- Agent-based system development, including: agent development techniques, tools and environments; agent programming languages; agent specification or validation languages
- Agent-based simulation, including: emergent behavior; participatory simulation; simulation techniques, tools and environments; social simulation
- Agreement technologies, including: argumentation; collective decision making; judgment aggregation and belief merging; negotiation; norms
- Economic paradigms, including: auction and mechanism design; bargaining and negotiation; economically-motivated agents; game theory (cooperative and non-cooperative); social choice and voting
- Learning agents, including: computational architectures for learning agents; evolution, adaptation; multi-agent learning
- Robotic agents, including: integrated perception, cognition, and action; cognitive robotics; robot planning (including action and motion planning); multi-robot systems
- Virtual agents, including: agents in games and virtual environments; companion and coaching agents; modeling personality, emotions; multimodal interaction; verbal and non-verbal expressiveness
- Significant, novel applications of agent technology
- Comprehensive reviews and authoritative tutorials of research and practice in agent systems
- Comprehensive and authoritative reviews of books dealing with agents and multi-agent systems