{"title":"基于对抗学习的自主HetNets干扰协调","authors":"Mu Yan, Jian Yang","doi":"10.1109/ICCSN52437.2021.9463652","DOIUrl":null,"url":null,"abstract":"This paper proposes an intelligent inter-cell interference coordination (ICIC) scheme for autonomous heterogeneous networks (HetNets), where the SBSs agilely schedule sub-channels to individual users at each Transmit Time Interval (TTI) with aim of mitigating interferences and maximizing long-term throughput by sensing the environment. As only local network states including the Signal to Interference plus Noise Ratio (SINR) can be observed in the autonomous HetNets, the decision-making process of the interference coordination at SBSs is modeled as a non-cooperative partially observable Markov decision process (POMDP) game, with aim to achieve Nash Equilibrium. Since the reward function is inexplicit and only few samples can be used for prior-training, we formulate the ICIC problem as a distributed inverse reinforcement learning (IRL) problem following the POMDP games. Furthermore, we propose a non-prior knowledge based self-imitating learning (SIL) algorithm which incorporates Wasserstein Generative Adversarial Networks (WGANs) and Double Deep Q Network (Double DQN) algorithms for performing behavior imitation and few-shot learning in solving the IRL problem from both the policy and value. In order to cater for the plug-and-play operation mode of indoor SBSs, the Double DQN is initialized according to the SINR, and a nested training scheme is adopted to overcome the slow-start problem of the learning process. Numerical results reveal that SIL is able to implement TTI level’s decision-making to solve the ICIC problem, and the overall network throughput of SIL can be improved by up to 19.8% when compared with other known benchmark algorithms.","PeriodicalId":263568,"journal":{"name":"2021 13th International Conference on Communication Software and Networks (ICCSN)","volume":"2003 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Interference Coordination for Autonomous HetNets Based on Adversarial Learning\",\"authors\":\"Mu Yan, Jian Yang\",\"doi\":\"10.1109/ICCSN52437.2021.9463652\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper proposes an intelligent inter-cell interference coordination (ICIC) scheme for autonomous heterogeneous networks (HetNets), where the SBSs agilely schedule sub-channels to individual users at each Transmit Time Interval (TTI) with aim of mitigating interferences and maximizing long-term throughput by sensing the environment. As only local network states including the Signal to Interference plus Noise Ratio (SINR) can be observed in the autonomous HetNets, the decision-making process of the interference coordination at SBSs is modeled as a non-cooperative partially observable Markov decision process (POMDP) game, with aim to achieve Nash Equilibrium. Since the reward function is inexplicit and only few samples can be used for prior-training, we formulate the ICIC problem as a distributed inverse reinforcement learning (IRL) problem following the POMDP games. 
Furthermore, we propose a non-prior knowledge based self-imitating learning (SIL) algorithm which incorporates Wasserstein Generative Adversarial Networks (WGANs) and Double Deep Q Network (Double DQN) algorithms for performing behavior imitation and few-shot learning in solving the IRL problem from both the policy and value. In order to cater for the plug-and-play operation mode of indoor SBSs, the Double DQN is initialized according to the SINR, and a nested training scheme is adopted to overcome the slow-start problem of the learning process. Numerical results reveal that SIL is able to implement TTI level’s decision-making to solve the ICIC problem, and the overall network throughput of SIL can be improved by up to 19.8% when compared with other known benchmark algorithms.\",\"PeriodicalId\":263568,\"journal\":{\"name\":\"2021 13th International Conference on Communication Software and Networks (ICCSN)\",\"volume\":\"2003 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-06-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 13th International Conference on Communication Software and Networks (ICCSN)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCSN52437.2021.9463652\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 13th International Conference on Communication Software and Networks (ICCSN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCSN52437.2021.9463652","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
This paper proposes an intelligent inter-cell interference coordination (ICIC) scheme for autonomous heterogeneous networks (HetNets), in which the small base stations (SBSs) agilely schedule sub-channels to individual users at each Transmit Time Interval (TTI), sensing the environment so as to mitigate interference and maximize long-term throughput. Since only local network states, including the Signal to Interference plus Noise Ratio (SINR), can be observed in autonomous HetNets, the interference-coordination decision-making process at the SBSs is modeled as a non-cooperative partially observable Markov decision process (POMDP) game whose goal is to reach a Nash equilibrium. Because the reward function is implicit and only a few samples are available for prior training, the ICIC problem is formulated as a distributed inverse reinforcement learning (IRL) problem built on the POMDP game. Furthermore, a non-prior-knowledge-based self-imitating learning (SIL) algorithm is proposed, which combines Wasserstein Generative Adversarial Networks (WGANs) with the Double Deep Q-Network (Double DQN) algorithm to perform behavior imitation and few-shot learning, solving the IRL problem from both the policy and the value perspectives. To support the plug-and-play operation mode of indoor SBSs, the Double DQN is initialized according to the SINR, and a nested training scheme is adopted to overcome the slow start of the learning process. Numerical results show that SIL achieves TTI-level decision-making for the ICIC problem and improves the overall network throughput by up to 19.8% compared with other known benchmark algorithms.
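To make the value-side component of SIL concrete, the following is a minimal, illustrative Double DQN update written in PyTorch. The paper does not publish code, so the network architecture, the hyper-parameters, and the assumption that each SBS observes a vector of per-sub-channel SINR values are hypothetical choices for this sketch, not the authors' implementation.

```python
# Illustrative Double DQN update for one SBS. All names and sizes are assumptions.
import torch
import torch.nn as nn

class QNet(nn.Module):
    """Maps a local observation (assumed: per-sub-channel SINR vector) to one Q-value per scheduling action."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

def double_dqn_loss(online: QNet, target: QNet, batch, gamma: float = 0.99) -> torch.Tensor:
    """Double DQN loss: the online net selects the next action, the target net evaluates it."""
    obs, action, reward, next_obs, done = batch          # action: LongTensor of shape (B,)
    q_sa = online(obs).gather(1, action.unsqueeze(1)).squeeze(1)   # Q_online(s, a)
    with torch.no_grad():
        next_a = online(next_obs).argmax(dim=1, keepdim=True)      # a* = argmax_a' Q_online(s', a')
        next_q = target(next_obs).gather(1, next_a).squeeze(1)     # Q_target(s', a*)
        y = reward + gamma * (1.0 - done) * next_q                 # bootstrap target
    return nn.functional.smooth_l1_loss(q_sa, y)

# Example usage with a hypothetical 10-sub-channel observation and 4 scheduling actions.
online, target = QNet(10, 4), QNet(10, 4)
target.load_state_dict(online.state_dict())
opt = torch.optim.Adam(online.parameters(), lr=1e-3)
batch = (torch.randn(32, 10), torch.randint(0, 4, (32,)), torch.rand(32),
         torch.randn(32, 10), torch.zeros(32))
loss = double_dqn_loss(online, target, batch)
opt.zero_grad(); loss.backward(); opt.step()
```

In the paper's setting, such an update would run at each SBS at TTI granularity, with the Double DQN reportedly initialized from the observed SINR and trained under a nested scheme to avoid a slow start; those details are beyond this sketch.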
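On the policy side, the WGAN-based behavior imitation described in the abstract can be pictured as a Wasserstein critic that scores (observation, action) pairs against a small set of demonstrations and hands its score back to the learner as a learned reward. The gradient-penalty variant, the pair-wise input, the one-hot action encoding, and all names below are assumptions made for illustration; they are not the paper's actual SIL formulation.

```python
# Hedged sketch: a Wasserstein critic used as a learned reward for imitation.
import torch
import torch.nn as nn

class Critic(nn.Module):
    """Scores (observation, action) pairs; higher means 'more expert-like'. Actions are assumed one-hot."""
    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)

def critic_loss(critic: Critic, expert_obs, expert_act, agent_obs, agent_act,
                gp_weight: float = 10.0) -> torch.Tensor:
    """WGAN-GP style loss: push expert scores up, agent scores down, keep the critic ~1-Lipschitz."""
    w_loss = critic(agent_obs, agent_act).mean() - critic(expert_obs, expert_act).mean()
    # Gradient penalty on samples interpolated between expert and agent data.
    eps = torch.rand(expert_obs.size(0), 1)
    mix_obs = (eps * expert_obs + (1 - eps) * agent_obs).detach().requires_grad_(True)
    mix_act = (eps * expert_act + (1 - eps) * agent_act).detach().requires_grad_(True)
    grads = torch.autograd.grad(critic(mix_obs, mix_act).sum(),
                                [mix_obs, mix_act], create_graph=True)
    grad_norm = torch.cat([g.flatten(1) for g in grads], dim=1).norm(2, dim=1)
    return w_loss + gp_weight * ((grad_norm - 1.0) ** 2).mean()

def imitation_reward(critic: Critic, obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
    """The critic score serves as a dense, learned reward fed to the Double DQN learner."""
    with torch.no_grad():
        return critic(obs, act)
```

A training loop would alternate a few critic updates (using `critic_loss` on expert and agent batches, both hypothetical here) with Double DQN updates driven by `imitation_reward`; this is the usual adversarial-imitation pattern rather than a claim about the paper's exact procedure.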