Ning Rao;Hua Xu;Zisen Qi;Dan Wang;Xiang Peng;Lei Jiang
IEEE Communications Letters, vol. 29, no. 1, pp. 105-109. Journal article, published 2024-11-29. DOI: 10.1109/LCOMM.2024.3502423. https://ieeexplore.ieee.org/document/10771787/
Adaptive Jamming Decision-Making Against FHSS Communications via Inexpert Demonstrations Assisted Meta Reinforcement Learning
Reinforcement learning (RL) has been widely applied to jamming decision-making in wireless communications because of its powerful optimization capabilities. However, the generalization of jamming policies has rarely been explored, and most existing studies rely on task-customized reward functions, which are often intractable to design. To address these issues, we propose a meta-RL method for jamming decision-making against frequency-hopping spread spectrum (FHSS) communications, aided by inexpert demonstrations. First, the policy network is meta-trained on multiple diverse tasks to obtain initial network parameters that generalize well. Subsequently, we combine RL with behavioral cloning (BC) to extract useful information from the demonstrations, together with learning-rate adaptation, to achieve efficient policy exploration without a task-customized jamming reward. Simulations confirm that the proposed method adapts to unseen jamming tasks with just a few fine-tuning steps under a general binary reward, and that it achieves higher accumulated jamming rewards and lower normalized user throughput than state-of-the-art methods.
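To make the idea of combining RL with behavioral cloning under a binary reward more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a softmax policy over a small set of channels, a REINFORCE-style surrogate for the RL term, and a fixed weight `bc_weight` on the BC term. The function names, the weighting scheme, and the reward definition (1 when the jammed channel matches the user's channel) are all illustrative assumptions; the abstract does not specify them.

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax over the last axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def binary_reward(jammed_channel, user_channel):
    # Assumed generic binary reward: 1 if the jammer hit the user's channel, else 0.
    return (np.asarray(jammed_channel) == np.asarray(user_channel)).astype(float)

def rl_bc_loss(logits, actions, rewards, demo_logits, demo_actions, bc_weight=0.5):
    # RL term (REINFORCE-style surrogate): -mean(reward * log pi(a | s)).
    logp = log_softmax(logits)
    rl_term = -np.mean(rewards * logp[np.arange(len(actions)), actions])
    # BC term: cross-entropy between the policy and the demonstrated actions.
    demo_logp = log_softmax(demo_logits)
    bc_term = -np.mean(demo_logp[np.arange(len(demo_actions)), demo_actions])
    # Hypothetical fixed weighting; the paper's actual combination may differ.
    return rl_term + bc_weight * bc_term, rl_term, bc_term

# Usage: an untrained (uniform) policy over 4 channels.
rewards = binary_reward([0, 1], [0, 2])
total, rl_term, bc_term = rl_bc_loss(
    np.zeros((2, 4)), np.array([0, 1]), rewards,
    np.zeros((3, 4)), np.array([1, 2, 3]),
)
```

In a sketch like this, the BC term supplies a learning signal even when the binary reward is sparse, which is one plausible reason to pair demonstrations with a generic reward. Because the demonstrations are inexpert, a fixed `bc_weight` would eventually bias the policy toward suboptimal actions, so the weight would likely need to be decayed or adapted.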
About the journal:
The IEEE Communications Letters publishes short papers in a rapid publication cycle on advances in the state-of-the-art of communication over different media and channels including wire, underground, waveguide, optical fiber, and storage channels. Both theoretical contributions (including new techniques, concepts, and analyses) and practical contributions (including system experiments and prototypes, and new applications) are encouraged. This journal focuses on the physical layer and the link layer of communication systems.