Multi-Agent Discrete Soft Actor-Critic Algorithm-Based Multi-User Collaborative Anti-Jamming Strategy

Impact factor: 8.0 | Zone 1 (Computer Science) | JCR Q1 (COMPUTER SCIENCE, THEORY & METHODS)

IEEE Transactions on Information Forensics and Security. Pub Date: 2025-03-14. DOI: 10.1109/TIFS.2025.3570160

Xiaorong Jing;Rui Wang;Hongjiang Lei;Hongqing Liu;Qianbin Chen

Volume 20, pages 5025-5038. Full text: https://ieeexplore.ieee.org/document/11003932/

Citations: 0

Abstract


In multi-user adversarial scenarios involving external malicious jamming and internal co-channel interference, environmental instability and increased decision-making dimensions cause traditional deep reinforcement learning (DRL)-based anti-jamming schemes to suffer from insufficient exploration. Agents must choose policies from a large action set, leading to a significant decline in anti-jamming performance. To address these issues, this paper proposes a multi-agent discrete soft actor-critic (MA-DSAC) algorithm-based collaborative anti-jamming strategy, integrating frequency, power, and modulation-coding domains. This strategy first introduces a Markov game to model and analyze the multi-user anti-jamming problem. Next, the soft actor-critic (SAC) algorithm is discretized to handle the multi-dimensional discrete action space. Finally, through information exchange between communication transceivers and based on a centralized training with decentralized execution (CTDE) framework, it is extended to a multi-agent DRL algorithm to achieve efficient multi-user cooperative anti-jamming. Simulation results show that in various anti-jamming scenarios with both fixed-mode and intelligent jammers, the proposed anti-jamming strategy’s performance improves by more than 25% compared to traditional value-based DRL strategies, including independent deep Q-network (I-DQN) and multi-agent virtual exploration in deep Q-learning (MA-VEDQL). Furthermore, through information exchange between communication transceivers, the instability problem of multi-agent DRL is effectively alleviated, enabling the communication transceivers to balance competition and cooperation. Consequently, its anti-jamming performance improves by more than 6% compared to the independent DSAC (I-DSAC) strategy.
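The discretization step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly used discrete form of SAC, in which the policy outputs a full probability vector over a finite action set, so the entropy-regularized expectations are exact sums rather than sampled estimates. The abstract does not give the authors' formulation, so this is not their implementation: the domain sizes, the temperature `alpha`, and all function names are hypothetical and chosen only for illustration.

```python
import itertools
import math

# Hypothetical domain sizes (assumptions, not values from the paper).
N_CHANNELS, N_POWER_LEVELS, N_MCS = 4, 3, 2

# Flatten the frequency, power, and modulation-coding choices into one
# discrete action set; each action is a (channel, power, mcs) tuple.
ACTIONS = list(itertools.product(range(N_CHANNELS),
                                 range(N_POWER_LEVELS),
                                 range(N_MCS)))


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_state_value(q_values, probs, alpha):
    """V(s) = sum_a pi(a|s) * (Q(s,a) - alpha * log pi(a|s))."""
    return sum(p * (q - alpha * math.log(p))
               for q, p in zip(q_values, probs) if p > 0.0)


def actor_loss(q_values, probs, alpha):
    """Actor objective to minimize: sum_a pi(a|s) * (alpha * log pi(a|s) - Q(s,a))."""
    return sum(p * (alpha * math.log(p) - q)
               for q, p in zip(q_values, probs) if p > 0.0)


alpha = 0.2  # entropy temperature (assumed value)
uniform = softmax([0.0] * len(ACTIONS))  # uniform policy over all actions
zero_q = [0.0] * len(ACTIONS)            # placeholder critic output
value = soft_state_value(zero_q, uniform, alpha)
loss = actor_loss(zero_q, uniform, alpha)
```

With a uniform policy and zero Q-values, the soft value reduces to `alpha` times the maximum entropy, `alpha * log(len(ACTIONS))`, which shows how the entropy term rewards exploration when the joint action set is large. In a CTDE setup, each agent would hold its own policy of this kind, while the critic used during training could additionally condition on information exchanged between transceivers.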


Source journal

IEEE Transactions on Information Forensics and Security (Engineering and Technology: Electrical and Electronic Engineering)

CiteScore: 14.40

Self-citation rate: 7.40%

Publications: 234

Review time: 6.5 months

About the journal: The IEEE Transactions on Information Forensics and Security covers the sciences, technologies, and applications relating to information forensics, information security, biometrics, surveillance, and systems applications that incorporate these features.