Joint Optimal Allocation of Resources for Multiple Jammer Based on Multi-Agent Deep Reinforcement Learning

IF 1.5, Zone 4 (Management), JCR Q3: ENGINEERING, ELECTRICAL & ELECTRONIC

IET Radar, Sonar & Navigation. Pub Date: 2025-05-04. DOI: 10.1049/rsn2.70031

Jieling Wang, Yanfei Liu, Chao Li, Zhong Wang, Yali Li



Abstract

To address the complex scenario in which multiple jammers navigate through a netted radar system (NRS), this study presents an optimised allocation algorithm for cooperative jamming resources, the Multi-Agent Jamming Resource Allocation (MJCJRA) algorithm, which is based on multi-agent deep reinforcement learning. First, the study derives a target fusion detection probability function and a global performance index optimisation function, both tailored to the jamming and radar detection models of the scenario. Next, the jammers are mapped onto a multi-agent system, and a greedy strategy generates targeted rewards for the jamming agents to improve their learning efficiency and performance. Finally, the study designs evaluation and mixed-strategy networks for the jamming agents. Training uses an exponential mean shift method for soft updates of the target network, adopts prioritised experience replay with importance sampling, and incorporates reward centring into the loss function for network updates. Experimental results show that the MJCJRA algorithm significantly outperforms the baseline method, particle swarm optimisation (PSO), the snow ablation optimiser (SAO), multi-agent deep deterministic policy gradient (MADDPG) and multi-agent proximal policy optimisation (MAPPO), effectively reducing the detection capability of the NRS.
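The abstract names three training techniques (soft target-network updates, prioritised experience replay with importance sampling, and reward centring) without giving their equations. The following is a minimal sketch that assumes the commonly used textbook forms of each; it is not the authors' implementation. All function names and the hyperparameters `tau`, `alpha`, `beta` and `gamma` are hypothetical, and network parameters are stood in for by plain lists of floats.

```python
import random


def soft_update(target, online, tau=0.01):
    """Move each target parameter a fraction tau towards the online one.

    Assumed form: target <- (1 - tau) * target + tau * online.
    """
    return [(1.0 - tau) * t + tau * o for t, o in zip(target, online)]


def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Sampling probabilities proportional to (|TD error| + eps) ** alpha."""
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


def importance_weights(probs, beta=0.4):
    """Importance-sampling weights (N * P(i)) ** -beta, scaled so max is 1."""
    n = len(probs)
    raw = [(n * p) ** (-beta) for p in probs]
    largest = max(raw)
    return [w / largest for w in raw]


def sample_indices(probs, batch_size, rng):
    """Draw replay-buffer indices according to the priority distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=batch_size)


def centred_td_error(reward, avg_reward, q_value, q_next, gamma=0.99):
    """TD error with the running average reward subtracted (assumed form)."""
    return (reward - avg_reward) + gamma * q_next - q_value


if __name__ == "__main__":
    rng = random.Random(0)
    td_errors = [0.1, 2.0, 0.5, 0.05]  # illustrative values only
    probs = per_probabilities(td_errors)
    weights = importance_weights(probs)
    batch = sample_indices(probs, batch_size=3, rng=rng)
    print("sampling probabilities:", probs)
    print("importance weights:", weights)
    print("sampled indices:", batch)
    print("soft update:", soft_update([0.0, 1.0], [1.0, 1.0], tau=0.1))
```

In this sketch, transitions with larger TD errors are replayed more often, and the importance weights scale their loss contribution down to compensate for that bias. How the paper combines these pieces inside its loss function is not stated in the abstract.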




Source journal

IET Radar, Sonar & Navigation (Engineering & Technology: Telecommunications)

CiteScore: 4.10
Self-citation rate: 11.80%
Articles published: 137
Review time: 3.4 months

Journal description: IET Radar, Sonar & Navigation covers the theory and practice of systems and signals for radar, sonar, radiolocation, navigation, and surveillance purposes, in aerospace and terrestrial applications. Examples include advances in waveform design, clutter and detection, electronic warfare, adaptive array and superresolution methods, tracking algorithms, synthetic aperture, and target recognition techniques.