Title: Reinforcement-Learning-Based Trajectory Design and Phase-Shift Control in UAV-Mounted-RIS Communications
Authors: Tianjiao Sun; Sixing Yin; Li Deng; F. Richard Yu
DOI: 10.1109/TMLCN.2024.3502576
Journal: IEEE Transactions on Machine Learning in Communications and Networking, vol. 3, pp. 163-175
Publication date: 2024-11-19 (Journal Article)
URL: https://ieeexplore.ieee.org/document/10758222/
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10758222
Citations: 0
Abstract
Taking advantage of both unmanned aerial vehicles (UAVs) and reconfigurable intelligent surfaces (RISs), UAV-mounted-RIS systems are expected to enhance transmission performance in complicated wireless environments. In this paper, we focus on the design of a UAV-mounted-RIS system and investigate joint optimization of the RIS’s phase shifts and the UAV’s trajectory. To cope with the practical issue that the user terminals’ (UTs’) locations and channel state are inaccessible, a reinforcement learning (RL)-based solution is proposed to find the optimal policy within a finite number of “trial-and-error” steps. Since the action space is continuous, the deep deterministic policy gradient (DDPG) algorithm is applied to train the RL model. However, online interaction between the agent and the environment may lead to instability during training, and the assumption of (first-order) Markovian state transitions can be impractical in real-world problems. Therefore, the decision transformer (DT) algorithm is employed as an alternative for training the RL model, adapting it to more general state-transition dynamics. Experimental results demonstrate that the proposed RL solutions are highly efficient in model training and achieve performance close to that of the benchmark, which relies on conventional optimization algorithms with the UTs’ locations and channel parameters explicitly known beforehand.
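To illustrate why DDPG fits here: the joint action (one phase shift per RIS element plus the UAV’s velocity) is continuous, so a deterministic actor that maps the state directly to a bounded action vector is natural. The sketch below is a minimal, hedged illustration of that idea only, not the paper’s model: the single-layer actor, the state and action dimensions, the 5 m/s velocity bound, and the element count are all assumptions made up for the example.

```python
import numpy as np

N_ELEMENTS = 8   # assumed number of RIS reflecting elements
STATE_DIM = 4    # assumed state size (e.g., UAV position + a channel observation)

rng = np.random.default_rng(0)
# Toy single-layer "actor" weights; a real DDPG actor would be a trained deep network.
W = rng.normal(scale=0.1, size=(N_ELEMENTS + 2, STATE_DIM))
b = np.zeros(N_ELEMENTS + 2)

def actor(state):
    """Deterministic policy: squash the raw output with tanh, then rescale
    each component to its physical range, as DDPG actors typically do."""
    raw = np.tanh(W @ state + b)                       # each entry in (-1, 1)
    phase_shifts = (raw[:N_ELEMENTS] + 1.0) * np.pi    # map to [0, 2*pi)
    velocity = raw[N_ELEMENTS:] * 5.0                  # 2-D UAV velocity, assumed |v_i| <= 5 m/s
    return phase_shifts, velocity

phases, vel = actor(np.ones(STATE_DIM))
print(phases.shape, vel.shape)
```

The tanh-plus-rescale step is the standard way a deterministic policy respects box constraints on a continuous action space; the exploration noise, critic, and replay buffer that complete DDPG are omitted for brevity.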