Research on Power Control Algorithm Based on Distributed Reinforcement Learning (基于分布式强化学习的功率控制算法研究)

Journal of software engineering and applications. Pub Date: 2023-01-01. DOI: 10.12677/sea.2023.123052

司轲, 李烨


Citations: 0

Abstract


Research on Power Control Algorithm Based on Distributed Reinforcement Learning

Reinforcement learning has been applied as a model-free control method to mitigate co-channel interference in cellular networks. However, in value-based reinforcement learning algorithms, function approximation error leads to overestimation of the Q value, which causes the algorithm to converge to a suboptimal policy and to suppress channel interference poorly; convergence is also slow in high-frequency scenarios. This paper proposes a control method suitable for distributed deployment, which uses DDQN to learn discrete policies and adds a delayed deep deterministic policy gradient algorithm with a triple-critic mechanism [...]

司轲, 李烨
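As a rough illustration of the Q-value overestimation issue mentioned in the abstract, the sketch below contrasts the standard DQN bootstrap target with the Double DQN (DDQN) target. It is not the authors' implementation: the function names, the Q-value estimates, the reward, and the discount factor are all hypothetical, and the delayed deterministic policy gradient component and its triple-critic mechanism are not modeled here.

```python
from typing import Sequence


def dqn_target(reward: float, gamma: float, q_target_next: Sequence[float]) -> float:
    """Standard DQN target: the target network both selects and evaluates the action."""
    return reward + gamma * max(q_target_next)


def ddqn_target(
    reward: float,
    gamma: float,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
) -> float:
    """Double DQN target: the online network selects the action,
    and the target network evaluates that action."""
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


# Hypothetical Q-value estimates for three discrete power levels in the next state.
q_online_next = [1.0, 3.0, 2.0]
q_target_next = [2.5, 0.5, 4.0]

# Hypothetical reward and discount factor.
y_dqn = dqn_target(1.0, 0.9, q_target_next)
y_ddqn = ddqn_target(1.0, 0.9, q_online_next, q_target_next)
print(y_dqn, y_ddqn)
```

For a non-negative discount factor, the DDQN target evaluates a single action under the target network, so it can never exceed the maximum over the same target values. This decoupling of action selection from action evaluation is the usual reason DDQN is expected to reduce overestimation.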

