Joint Optimal Allocation of Resources for Multiple Jammer Based on Multi-Agent Deep Reinforcement Learning

IF 1.4 · Zone 4 (Management Science) · JCR Q3 · ENGINEERING, ELECTRICAL & ELECTRONIC
Jieling Wang, Yanfei Liu, Chao Li, Zhong Wang, Yali Li
DOI: 10.1049/rsn2.70031
Journal: IET Radar, Sonar & Navigation, vol. 19 (1)
Publication date: 2025-05-04
Publication type: Journal Article
Full text: https://onlinelibrary.wiley.com/doi/10.1049/rsn2.70031
Citation count: 0

Abstract

In response to the complex scenario in which multiple jammers navigate through a netted radar system (NRS), this study presents an optimised allocation algorithm for cooperative jamming resources, the Multi-Agent Jamming Resource Allocation (MJCJRA) algorithm, based on multi-agent deep reinforcement learning. First, the research develops a target fusion detection probability function and a global performance index optimisation function, both tailored to the jamming and radar detection models of the scenario. Next, the multiple jammers are mapped into a multi-agent system, and a greedy strategy is employed to generate targeted rewards for the jamming agents, improving their learning efficiency and performance. Finally, evaluation and mixed-strategy networks are designed for the jamming agents: the algorithm uses an exponential mean shift method for soft updates of the target network, adopts prioritised experience replay with importance sampling, and incorporates reward centring into the loss function for network updates. Experimental results show that the MJCJRA algorithm significantly outperforms the baseline method, particle swarm optimisation (PSO), the snow ablation optimiser (SAO), multi-agent deep deterministic policy gradient (MADDPG) and multi-agent proximal policy optimisation (MAPPO), effectively diminishing the detection capability of the NRS.
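
The abstract names three training devices without giving formulas: soft updates of the target network, prioritised experience replay with importance sampling, and reward centring. The sketch below shows the textbook form of each in plain Python, purely as an illustration. It is not the authors' implementation: all function names, the hyperparameters (tau, alpha, beta, gamma, step) and the assumption that the "exponential mean shift" soft update is an exponential moving average of the parameters are hypothetical.

```python
import random

def soft_update(target, online, tau=0.01):
    """Move each target parameter a fraction tau towards the online parameter
    (an exponential moving average; assumed form of the soft update)."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target, online)]

def per_probabilities(priorities, alpha=0.6):
    """Sampling probability of each stored transition: p_i^alpha / sum_k p_k^alpha."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]

def importance_weights(probs, beta=0.4):
    """Importance-sampling weights (N * P(i))^(-beta), normalised so the largest is 1."""
    n = len(probs)
    raw = [(n * p) ** (-beta) for p in probs]
    max_w = max(raw)
    return [w / max_w for w in raw]

def centred_td_error(reward, avg_reward, q_sa, q_next, gamma=0.99):
    """TD error with the running average reward subtracted from the reward."""
    return (reward - avg_reward) + gamma * q_next - q_sa

def update_avg_reward(avg_reward, reward, step=0.1):
    """Running estimate of the average reward used for centring."""
    return avg_reward + step * (reward - avg_reward)

if __name__ == "__main__":
    priorities = [1.0, 2.0, 4.0]           # hypothetical |TD error| priorities
    probs = per_probabilities(priorities)
    weights = importance_weights(probs)
    # Draw a mini-batch of indices in proportion to priority.
    batch = random.choices(range(len(priorities)), weights=probs, k=2)
    print(batch, [round(w, 3) for w in weights])
```

In a training loop of this kind, each sampled transition's squared centred TD error would typically be multiplied by its importance weight before averaging into the loss, and the target parameters would be refreshed with soft_update after each gradient step.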

Source journal
IET Radar, Sonar & Navigation (Engineering and Technology: Telecommunications)
CiteScore: 4.10
Self-citation rate: 11.80%
Articles published: 137
Review time: 3.4 months
Journal description: IET Radar, Sonar & Navigation covers the theory and practice of systems and signals for radar, sonar, radiolocation, navigation, and surveillance purposes, in aerospace and terrestrial applications. Examples include advances in waveform design, clutter and detection, electronic warfare, adaptive array and superresolution methods, tracking algorithms, synthetic aperture, and target recognition techniques.