Multi-dimensional resource allocation strategy for LEO satellite communication uplinks based on deep reinforcement learning
Authors: Yu Hu, Feipeng Qiu, Fei Zheng, Jilong Zhao
Journal of Cloud Computing, published 2024-03-08
DOI: https://doi.org/10.1186/s13677-024-00621-z
In LEO satellite communication systems, resource utilization is very low because on-board resources are constrained and traffic is unevenly distributed. In addition, the rapid movement of LEO satellites produces a complex, constantly changing network, which makes it difficult for traditional resource allocation strategies to improve utilization. To address this problem, this paper proposes a resource allocation strategy based on deep reinforcement learning. The strategy takes the weighted sum of spectral efficiency, energy efficiency and blocking rate as the optimization objective and constructs a joint power and channel allocation model. Channels and power are allocated according to the number of channels, the number of users and the service type. In the reward mechanism, the maximum reward is obtained by maximizing the increment of the optimization objective. During optimization, however, decisions always focus on the optimal allocation for current users and ignore QoS for new users. To avoid this, the beams currently being served are integrated with high-traffic beams, and the beam states are restructured so that the agent maximizes long-term benefit and improves system performance. Simulation experiments show that, in scenarios with a large number of users, the proposed strategy reduces the blocking rate by at least 5% compared with reinforcement learning methods, effectively improving resource utilization.
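As a rough illustration of the reward described in the abstract, the sketch below computes a weighted-sum objective and uses its increment between two consecutive allocation steps as the reward. The weight values, the normalization of each metric to [0, 1], the choice to subtract the blocking rate (since lower is better), and all names are assumptions made for illustration; the paper's actual formulation may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    """System metrics after an allocation step (all assumed normalized)."""
    spectral_efficiency: float  # assumed scaled to [0, 1]
    energy_efficiency: float    # assumed scaled to [0, 1]
    blocking_rate: float        # fraction of blocked requests, in [0, 1]


def objective(m: Metrics, w_se: float = 0.4, w_ee: float = 0.3,
              w_br: float = 0.3) -> float:
    """Weighted-sum objective; weights are hypothetical placeholders.

    Blocking rate enters with a negative sign so that a higher objective
    is always better (an assumption, not taken from the paper).
    """
    return (w_se * m.spectral_efficiency
            + w_ee * m.energy_efficiency
            - w_br * m.blocking_rate)


def reward(prev: Metrics, curr: Metrics, **weights: float) -> float:
    """Reward as the increment of the objective between two steps."""
    return objective(curr, **weights) - objective(prev, **weights)


# Example: an action that raises spectral efficiency and lowers blocking.
before = Metrics(spectral_efficiency=0.50, energy_efficiency=0.40, blocking_rate=0.20)
after = Metrics(spectral_efficiency=0.60, energy_efficiency=0.40, blocking_rate=0.10)
print(round(reward(before, after), 3))
```

An increment-based reward of this kind is positive only when an action improves the objective, which matches the abstract's remark that such decisions favor current users; the beam-state restructuring the authors describe is not modeled here.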