Deep Reinforcement Learning-Based Intelligent Reflecting Surface Optimization for TDD Multi-User MIMO Systems
Authors: Fengyu Zhao; Wen Chen; Ziwei Liu; Jun Li; Qingqing Wu
Journal: IEEE Wireless Communications Letters, vol. 12, no. 11, pp. 1951-1955
Publication date: 2023-08-03
DOI: 10.1109/LWC.2023.3301496
URL: https://ieeexplore.ieee.org/document/10206034/
Impact factor: 4.6 (JCR Q1, Computer Science, Information Systems)
In this letter, we investigate the discrete phase shift design of the intelligent reflecting surface (IRS) in a time-division duplexing (TDD) multi-user multiple-input multiple-output (MIMO) system. We modify the design of the deep reinforcement learning (DRL) scheme so that the average downlink data transmission rate can be maximized without requiring the sub-channel channel state information (CSI). Based on the characteristics of the model, we modify the proximal policy optimization (PPO) algorithm and integrate a gated recurrent unit (GRU) to tackle the non-convex optimization problem. Simulation results show that the proposed PPO-GRU surpasses the benchmarks in performance, convergence speed, and training stability.
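As a rough illustration of two ingredients named in the abstract, the sketch below builds a uniform discrete phase-shift set for the IRS elements and evaluates the standard PPO clipped surrogate objective. It is not the authors' modified PPO-GRU: the uniform b-bit phase set, the function names, the `clip_eps` value, and the NumPy implementation are all assumptions made for illustration, and the GRU and the system model are omitted.

```python
import numpy as np


def discrete_phase_set(bits):
    """Return the 2**bits uniformly spaced phases in [0, 2*pi) (assumed set)."""
    levels = 2 ** bits
    return 2.0 * np.pi * np.arange(levels) / levels


def reflection_coefficients(action_indices, bits):
    """Map one phase index per IRS element to a unit-modulus coefficient."""
    phases = discrete_phase_set(bits)[np.asarray(action_indices)]
    return np.exp(1j * phases)


def ppo_clipped_surrogate(new_logp, old_logp, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate (to be maximized), not the paper's variant."""
    new_logp = np.asarray(new_logp, dtype=float)
    old_logp = np.asarray(old_logp, dtype=float)
    advantages = np.asarray(advantages, dtype=float)
    # Probability ratio between the updated policy and the data-collecting policy.
    ratio = np.exp(new_logp - old_logp)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # The element-wise minimum removes the incentive to move the ratio far from 1.
    return float(np.mean(np.minimum(unclipped, clipped)))


if __name__ == "__main__":
    # Hypothetical 4-element IRS with 2-bit phase resolution.
    theta = reflection_coefficients([0, 1, 2, 3], bits=2)
    print(np.round(theta, 3))
    # Hypothetical log-probabilities and advantages for three sampled actions.
    print(ppo_clipped_surrogate([-1.0, -0.5, -2.0], [-1.2, -0.5, -1.0], [1.0, -0.5, 2.0]))
```

In a DRL formulation of this kind, the agent's action would typically be the vector of phase indices, so the discreteness of the phase shifts is handled by the action space itself rather than by quantizing a continuous solution afterwards.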
About the journal:
IEEE Wireless Communications Letters publishes short papers in a rapid publication cycle on advances in the state-of-the-art of wireless communications. Both theoretical contributions (including new techniques, concepts, and analyses) and practical contributions (including system experiments and prototypes, and new applications) are encouraged. This journal focuses on the physical layer and the link layer of wireless communication systems.