Interference Coordination for Autonomous HetNets Based on Adversarial Learning
Authors: Mu Yan, Jian Yang
Published in: 2021 13th International Conference on Communication Software and Networks (ICCSN), 2021-06-04
DOI: 10.1109/ICCSN52437.2021.9463652 (https://doi.org/10.1109/ICCSN52437.2021.9463652)
Citations: 1
Abstract
This paper proposes an intelligent inter-cell interference coordination (ICIC) scheme for autonomous heterogeneous networks (HetNets), in which small base stations (SBSs) sense the environment and schedule sub-channels to individual users at each Transmission Time Interval (TTI), with the aim of mitigating interference and maximizing long-term throughput. Because only local network states, including the Signal to Interference plus Noise Ratio (SINR), can be observed in autonomous HetNets, the interference-coordination decision process at the SBSs is modeled as a non-cooperative partially observable Markov decision process (POMDP) game whose goal is to reach a Nash equilibrium. Since the reward function is not explicitly known and only a few samples are available for pre-training, we formulate the ICIC problem as a distributed inverse reinforcement learning (IRL) problem over the POMDP game. Furthermore, we propose a self-imitating learning (SIL) algorithm that requires no prior knowledge and combines Wasserstein Generative Adversarial Networks (WGANs) with the Double Deep Q Network (Double DQN) algorithm to perform behavior imitation and few-shot learning, solving the IRL problem from both the policy and the value perspectives. To support the plug-and-play operation mode of indoor SBSs, the Double DQN is initialized according to the SINR, and a nested training scheme is adopted to overcome the slow-start problem of the learning process. Numerical results show that SIL can make decisions at the TTI level to solve the ICIC problem and improves overall network throughput by up to 19.8% compared with known benchmark algorithms.
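The value side of the approach relies on Double DQN. As a minimal sketch of the generic Double DQN bootstrap target (not the authors' implementation), the snippet below lets the online network select the greedy next action and the target network evaluate it. The function name `double_dqn_target`, the discount `gamma = 0.9`, the three-sub-channel action set, and all numeric values are assumptions chosen for illustration; none of them come from the paper.

```python
from typing import Sequence


def double_dqn_target(
    reward: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    gamma: float = 0.9,
    done: bool = False,
) -> float:
    """Return the Double DQN bootstrap target for a single transition.

    The online network selects the greedy next action and the target
    network evaluates it. Decoupling selection from evaluation reduces
    the over-estimation bias of the plain DQN target.
    """
    if len(next_q_online) != len(next_q_target):
        raise ValueError("both Q-value vectors must cover the same actions")
    if done:
        # Terminal transition: nothing to bootstrap from.
        return reward
    # Action selection: argmax over the online network's estimates.
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    # Action evaluation: value of that action under the target network.
    return reward + gamma * next_q_target[best_action]


# Hypothetical example: an SBS choosing among three sub-channels.
q_online = [1.0, 2.5, 2.0]  # assumed online-network Q-values for the next state
q_target = [1.2, 1.8, 3.0]  # assumed target-network Q-values for the next state
y = double_dqn_target(reward=0.5, next_q_online=q_online, next_q_target=q_target)
```

With these assumed numbers the online network picks sub-channel 1, so the target is 0.5 + 0.9 × 1.8, approximately 2.12. A plain DQN target would instead bootstrap from the target network's own maximum (3.0) and give 3.2, which illustrates the over-estimation that Double DQN is designed to dampen.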