Joint Spectrum and Power Allocation in Wireless Network: A Two-Stage Multi-Agent Reinforcement Learning Method
Authors: Pengcheng Dai; He Wang; Huazhou Hou; Xusheng Qian; Wenwu Yu
Journal: IEEE Transactions on Emerging Topics in Computational Intelligence (JCR Q1, Computer Science, Artificial Intelligence; impact factor 5.3)
Published: 2024-02-16
DOI: 10.1109/TETCI.2024.3360305
URL: https://ieeexplore.ieee.org/document/10438524/
This paper investigates the application of a multi-agent reinforcement learning (MARL) algorithm to the joint spectrum and power allocation problem (JSPAP) in wireless networks. The objective of JSPAP is to optimize the subband selection and transmit power level of each link so as to maximize the sum-rate utility function. To address JSPAP, which involves discrete subband selection and continuous power allocation, most existing algorithms rely on a centralized optimizer and instantaneous global channel state information, which can be challenging to implement in large wireless networks with time-varying subbands. To overcome this limitation, a two-stage MARL algorithm is proposed, comprising a top-layer network that selects subbands across all links and a bottom-layer network that determines the transmit power levels of all transmitters. By using the value decomposition technique in the top-layer network, the links can cooperatively select transmission subbands, effectively resolving non-stationarity issues in the wireless network. In the bottom-layer network, each transmitter selects its transmit power level based solely on local information, which reduces the computational burden. Empirical experiments demonstrate the effectiveness of the proposed two-stage MARL algorithm in comparison with state-of-the-art RL algorithms and fractional programming algorithms.
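To make the objective concrete, the sketch below evaluates a plain (unweighted) sum rate for a given subband assignment and set of transmit powers. The abstract does not state the paper's exact utility or channel model, so the Shannon-rate form, the function name `sum_rate`, the `gain[j][i][m]` layout, and the unit noise power are all assumptions made for illustration only.

```python
import math


def sum_rate(gain, subband, power, noise=1.0):
    """Sum of Shannon rates (bits/s/Hz) over all links.

    Assumed layout (illustrative, not taken from the paper):
      gain[j][i][m] -- channel gain from transmitter j to receiver i on subband m
      subband[i]    -- index of the subband chosen by link i (discrete decision)
      power[i]      -- transmit power of link i (continuous decision)
    """
    n_links = len(subband)
    total = 0.0
    for i in range(n_links):
        m = subband[i]
        signal = gain[i][i][m] * power[i]
        # Only links that share subband m interfere with link i.
        interference = sum(
            gain[j][i][m] * power[j]
            for j in range(n_links)
            if j != i and subband[j] == m
        )
        total += math.log2(1.0 + signal / (noise + interference))
    return total


# Toy case: two links, two subbands, direct gain 1.0, cross gain 0.5.
gain = [[[1.0 if i == j else 0.5 for _ in range(2)] for i in range(2)]
        for j in range(2)]
shared = sum_rate(gain, [0, 0], [1.0, 1.0])    # both links on subband 0
separate = sum_rate(gain, [0, 1], [1.0, 1.0])  # one link per subband
```

In this toy case the separated assignment yields a higher sum rate than the shared one because the cross-link interference term vanishes. This coupling between the discrete subband choice and the continuous power choice is what the two-stage design splits across its top and bottom layers.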
About the journal:
The IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI) publishes original articles on emerging aspects of computational intelligence, including theory, applications, and surveys.
TETCI is an electronic-only publication. It publishes six issues per year.
Authors are encouraged to submit manuscripts on any emerging topic in computational intelligence, especially nature-inspired computing topics not covered by other IEEE Computational Intelligence Society journals. Illustrative examples include glial cell networks, computational neuroscience, brain-computer interfaces, ambient intelligence, non-fuzzy computing with words, artificial life, cultural learning, artificial endocrine networks, social reasoning, artificial hormone networks, and computational intelligence for the IoT and Smart-X technologies.