Fast adaptation of multi-cell NOMA resource allocation via federated meta-reinforcement learning

Impact Factor 4.6 · CAS Tier 2 (Computer Science) · Q1 Computer Science, Hardware & Architecture
Giang Minh Nguyen, Derek Kwaku Pobi Asiedu, Ji-Hoon Yun
{"title":"Fast adaptation of multi-cell NOMA resource allocation via federated meta-reinforcement learning","authors":"Giang Minh Nguyen ,&nbsp;Derek Kwaku Pobi Asiedu ,&nbsp;Ji-Hoon Yun","doi":"10.1016/j.comnet.2025.111701","DOIUrl":null,"url":null,"abstract":"<div><div>Radio resource allocation in multi-cellular systems, particularly with non-orthogonal multiple access (NOMA), must be carefully optimized based on real-time user and network conditions, such as channel responses, user population, and inter-cell interference patterns, which naturally fluctuate over time. Fixed machine learning models for radio resource allocation often fail to adapt to these dynamic conditions, leading to suboptimal resource allocation. Moreover, such models struggle to handle inputs and outputs of varying dimensions, limiting their scalability and generalization in time-varying resource allocation problems. To address these challenges, we propose a novel multi-cell, multi-subband NOMA radio resource allocation solution that integrates meta-learning and federated learning (FL) with multi-agent reinforcement learning (MARL). Our solution maximizes energy efficiency (EE) by enabling one-shot adaptation to environmental variations and dynamically managing information dimensionality through the instantiation and removal of agents from a pretrained model. Under this framework, power allocation (PA) and subband allocation (SA) are jointly optimized in a two-stage process: the first stage employs a central reinforcement learning (RL) agent to solve the PA subproblem, while the second stage leverages multi-agent meta-RL combined with FL to address the SA subproblem. Evaluation results demonstrate that our solution effectively adapts to dynamic environments, including variations in channel conditions due to path loss and Doppler effects, as well as fluctuations in the user set. Notably, our approach consistently outperforms the benchmark algorithms, highlighting its robustness and superior adaptability.</div></div>","PeriodicalId":50637,"journal":{"name":"Computer Networks","volume":"272 ","pages":"Article 111701"},"PeriodicalIF":4.6000,"publicationDate":"2025-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer Networks","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S138912862500667X","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
引用次数: 0

Abstract

Radio resource allocation in multi-cellular systems, particularly with non-orthogonal multiple access (NOMA), must be carefully optimized based on real-time user and network conditions, such as channel responses, user population, and inter-cell interference patterns, which naturally fluctuate over time. Fixed machine learning models for radio resource allocation often fail to adapt to these dynamic conditions, leading to suboptimal resource allocation. Moreover, such models struggle to handle inputs and outputs of varying dimensions, limiting their scalability and generalization in time-varying resource allocation problems. To address these challenges, we propose a novel multi-cell, multi-subband NOMA radio resource allocation solution that integrates meta-learning and federated learning (FL) with multi-agent reinforcement learning (MARL). Our solution maximizes energy efficiency (EE) by enabling one-shot adaptation to environmental variations and dynamically managing information dimensionality through the instantiation and removal of agents from a pretrained model. Under this framework, power allocation (PA) and subband allocation (SA) are jointly optimized in a two-stage process: the first stage employs a central reinforcement learning (RL) agent to solve the PA subproblem, while the second stage leverages multi-agent meta-RL combined with FL to address the SA subproblem. Evaluation results demonstrate that our solution effectively adapts to dynamic environments, including variations in channel conditions due to path loss and Doppler effects, as well as fluctuations in the user set. Notably, our approach consistently outperforms the benchmark algorithms, highlighting its robustness and superior adaptability.
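The abstract describes a two-stage structure: a central RL agent handles power allocation (PA), while per-cell agents trained with multi-agent meta-RL and aggregated via federated learning handle subband allocation (SA), with agents instantiated from (or removed relative to) a pretrained model as cells or users come and go. The sketch below illustrates only that control flow under toy assumptions; the linear policies, the `toy_ee_reward` proxy, and all names are hypothetical stand-ins, not the authors' implementation.

```python
# Hypothetical sketch of the two-stage loop described in the abstract:
# Stage 1: a central PA agent; Stage 2: per-cell SA agents whose parameters
# are federated-averaged, each taking a single ("one-shot") adaptation step.
import numpy as np

rng = np.random.default_rng(0)
N_CELLS, N_SUBBANDS, STATE_DIM = 3, 4, 8

def toy_ee_reward(pa_action, sa_logits):
    """Stand-in proxy for the energy-efficiency objective (assumption)."""
    power = np.clip(pa_action, 0.01, 1.0)
    rate_proxy = np.log1p(power * np.abs(sa_logits).mean())
    return rate_proxy / power  # rate-per-power, bits-per-Joule style

class LinearAgent:
    """Toy linear policy standing in for the RL agents."""
    def __init__(self, in_dim, out_dim):
        self.w = rng.normal(scale=0.1, size=(in_dim, out_dim))

    def act(self, state):
        return state @ self.w

    def adapt(self, state, reward, lr=0.05):
        # One-shot adaptation: a single policy-gradient-like update.
        grad = np.outer(state, np.sign(self.act(state))) * reward
        self.w += lr * grad

def federated_average(agents):
    """FL aggregation: average SA-agent parameters across cells."""
    mean_w = np.mean([a.w for a in agents], axis=0)
    for a in agents:
        a.w = mean_w.copy()

# Stage 1 agent (central PA) and Stage 2 agents (one SA agent per cell).
pa_agent = LinearAgent(STATE_DIM, 1)
sa_agents = [LinearAgent(STATE_DIM, N_SUBBANDS) for _ in range(N_CELLS)]

for episode in range(20):
    global_state = rng.normal(size=STATE_DIM)      # channel/user summary
    pa_action = pa_agent.act(global_state).item()  # power levels (stage 1)
    for agent in sa_agents:                        # subband choice (stage 2)
        local_state = rng.normal(size=STATE_DIM)
        sa_logits = agent.act(local_state)
        r = toy_ee_reward(pa_action, sa_logits)
        agent.adapt(local_state, r)                # one-shot local update
        pa_agent.adapt(global_state, r)
    federated_average(sa_agents)                   # FL round over SA agents

# Varying dimensionality: a cell joining the network instantiates a new
# agent from the averaged (meta-trained) weights; a departing cell's agent
# is simply removed from the list.
sa_agents.append(LinearAgent(STATE_DIM, N_SUBBANDS))
sa_agents[-1].w = sa_agents[0].w.copy()
print("episodes done; SA agents:", len(sa_agents))
```

The point of the sketch is the ordering of updates (central PA step, per-cell SA steps, then a federated averaging round) and the way new agents inherit the meta-trained parameters; the actual paper uses deep RL policies and a NOMA system model rather than these toy components.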
Source journal
Computer Networks (Engineering & Technology – Telecommunications)
CiteScore: 10.80
Self-citation rate: 3.60%
Articles published: 434
Review turnaround: 8.6 months
Journal description: Computer Networks is an international, archival journal providing a publication vehicle for complete coverage of all topics of interest to those involved in the computer communications networking area. The audience includes researchers, managers and operators of networks as well as designers and implementors. The Editorial Board will consider any material for publication that is of interest to those groups.