Dynamic spectrum anti-jamming with distributed learning and transfer learning

IF 3.1 · Zone 3 (Computer Science) · Q2 Telecommunications

China Communications · Pub Date: 2023-12-01 · DOI: 10.23919/JCC.fa.2022-0626.202312

Xinyu Zhu, Yang Huang, Delong Liu, Qihui Wu, Xiaohu Ge, Yuan Liu

Journal: China Communications · Pages: 52-65 · Publication type: Journal Article · Open access: No

Citations: 0

Abstract

Physical-layer security in wireless systems has attracted great attention. In this paper, we investigate the spectrum anti-jamming (AJ) problem for data transmissions between devices. Because physical-layer jamming attacks change rapidly in the time/frequency domain, frequency resources have to be configured for devices in advance, under unknown jamming patterns (i.e., the time-frequency distribution of the jamming signals), so as to avoid the jamming signals emitted by malicious devices. This process can be formulated as a Markov decision process and solved by reinforcement learning (RL). Unfortunately, state-of-the-art RL methods may overburden a system with limited computing resources. We therefore propose a novel RL method that integrates the asynchronous advantage actor-critic (A3C) approach with the kernel method to learn a flexible frequency pre-configuration policy. Moreover, traditional AJ strategies cannot adapt to time-varying jamming patterns. To handle this issue, we design a kernel-based feature transfer learning method that adjusts the structure of the policy function online. Simulation results show that the proposed approach significantly outperforms various baselines in terms of average normalized throughput and the convergence speed of policy learning.
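
To make the Markov decision process formulation concrete, here is a minimal sketch that uses plain tabular Q-learning. It is not the kernel-based A3C method proposed in the paper. Everything in it is an assumption chosen for illustration: the four-channel setup, the sweep jammer that jams channel `t mod N_CHANNELS` in slot `t`, the 0/1 reward, the optimistic initial Q-values, and all function names. The state is the channel jammed in the previous slot, and the action is the channel pre-configured for the next slot.

```python
import random

N_CHANNELS = 4  # hypothetical number of frequency channels


def jammed_channel(t):
    """Hypothetical sweep jammer: jams channel t mod N_CHANNELS in slot t."""
    return t % N_CHANNELS


def train(n_slots=5000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning; q[state][action] estimates the discounted return."""
    rng = random.Random(seed)
    # Optimistic initial values (an assumption) so every action gets tried.
    q = [[1.0 / (1.0 - gamma)] * N_CHANNELS for _ in range(N_CHANNELS)]
    state = jammed_channel(0)  # channel observed as jammed in the last slot
    for t in range(1, n_slots + 1):
        # Epsilon-greedy choice of the channel to use in slot t.
        if rng.random() < epsilon:
            action = rng.randrange(N_CHANNELS)
        else:
            action = max(range(N_CHANNELS), key=lambda a: q[state][a])
        # Reward 1 if the transmission avoids the jammer, else 0.
        reward = 0.0 if action == jammed_channel(t) else 1.0
        next_state = jammed_channel(t)
        td_target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (td_target - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    """Channel chosen in each state when acting greedily on q."""
    return [max(range(N_CHANNELS), key=lambda a: q[s][a])
            for s in range(N_CHANNELS)]


def average_throughput(policy, n_slots=1000):
    """Fraction of slots in which the greedy policy avoids the jammer."""
    hits = 0
    for t in range(1, n_slots + 1):
        state = jammed_channel(t - 1)
        hits += policy[state] != jammed_channel(t)
    return hits / n_slots


q_table = train()
policy = greedy_policy(q_table)
print("greedy policy per state:", policy)
print("average normalized throughput:", average_throughput(policy))
```

A table like this only works for a small, fixed jamming pattern. The abstract's motivation for kernel-based function approximation and online transfer learning is precisely the case where the pattern is unknown and changes over time.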


Source journal

China Communications (Engineering & Technology: Telecommunications)

CiteScore: 8.00

Self-citation rate: 12.20%

Articles published: 2868

Review time: 8.6 months

About the journal: China Communications (ISSN 1673-5447) is an English-language monthly journal cosponsored by the China Institute of Communications (CIC) and IEEE Communications Society (IEEE ComSoc). It is aimed at readers in industry, universities, research and development organizations, and government agencies in the field of Information and Communications Technologies (ICTs) worldwide. The journal's main objective is to promote academic exchange in the ICTs sector and publish high-quality papers to contribute to the global ICTs industry. It provides instant access to the latest articles and papers, presenting leading-edge research achievements, tutorial overviews, and descriptions of significant practical applications of technology. China Communications has been indexed in SCIE (Science Citation Index-Expanded) since January 2007. Additionally, all articles have been available in the IEEE Xplore digital library since January 2013.