Title: A GAN enhanced meta-deep reinforcement learning approach for DCN routing optimization
Authors: Qing Guo, Wei Zhao, Zhuoheng Lyu, Tingting Zhao
Journal: Information Fusion, vol. 121, Article 103160 (JCR Q1, Computer Science, Artificial Intelligence; impact factor 14.7)
Published: 2025-04-04
DOI: 10.1016/j.inffus.2025.103160
URL: https://www.sciencedirect.com/science/article/pii/S1566253525002337
Open access: No
Citations: 0
Abstract
In large-scale data center networks (DCNs), dynamic changes in traffic and topology produce traffic patterns that follow a long-tail distribution: a small number of traffic patterns appear frequently, while most appear rarely. The resulting data are clearly non-IID (not independent and identically distributed), which poses a serious challenge for traffic routing optimization. Traditional deep reinforcement learning (DRL) methods often assume relatively stable data distributions, and approaches that combine DRL with meta-learning assume only small shifts in traffic or the environment. Both struggle with the sample scarcity and cross-scenario feature-space shifts caused by long-tail distributions. To address this issue, this paper proposes a global intelligent routing optimization scheme based on meta-deep reinforcement learning (Meta-DRL) enhanced by generative adversarial networks (GANs). First, a two-level nested Meta-DRL model is built: the lower level optimizes task-specific policies, while the upper level learns generalized global network parameters, improving the model’s initialization quality and its generalization to unfamiliar network settings. Next, a mechanism is introduced that combines GAN-based feature encoding with a meta-learning discriminator, refining the GAN’s feature discrimination boundary and improving the quality of synthesized samples, especially where data are scarce. Furthermore, a parameter feature-space mapping mechanism is proposed to unify features from new and old tasks in a shared representation space, avoiding repeated Meta-DRL training when the network environment changes and thereby improving the model’s generalization and decision-making efficiency. Simulation results show that the GAN-enhanced Meta-DRL method significantly outperforms traditional DRL approaches and other meta-learning methods in convergence speed, accumulated reward, and network routing performance.
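The abstract gives no algorithmic detail for the two-level nested model, so the following is only a toy sketch of the general idea of a lower level that adapts to one task and an upper level that learns a shared initialization. It assumes a Reptile-style first-order meta-update, which is a stand-in and not necessarily the paper's method, and it replaces routing and reinforcement learning with a one-dimensional quadratic loss. The task targets, the Zipf-like task frequencies, and all function names are hypothetical.

```python
import random


def inner_adapt(theta, target, lr=0.1, steps=5):
    """Lower level: a few task-specific gradient steps on (theta - target)^2."""
    for _ in range(steps):
        grad = 2.0 * (theta - target)
        theta -= lr * grad
    return theta


def meta_train(targets, weights, meta_lr=0.1, iterations=500, seed=0):
    """Upper level: Reptile-style update of the shared initialization."""
    rng = random.Random(seed)
    meta_theta = 0.0
    for _ in range(iterations):
        # Sample one task; frequent tasks are drawn far more often than rare ones.
        target = rng.choices(targets, weights=weights, k=1)[0]
        adapted = inner_adapt(meta_theta, target)
        # Move the shared initialization toward the task-adapted parameters.
        meta_theta += meta_lr * (adapted - meta_theta)
    return meta_theta


def loss_after_adapt(init, target, steps=2):
    """Loss on a task after a short adaptation starting from `init`."""
    return (inner_adapt(init, target, steps=steps) - target) ** 2


# Hypothetical long-tail task mix: one frequent "head" task and rarer "tail" tasks.
targets = [10.0, 12.0, 8.0, 11.0, 9.0]
weights = [1.0 / (rank + 1) for rank in range(len(targets))]  # Zipf-like frequencies

meta_theta = meta_train(targets, weights)

# Compare adaptation to an unseen task from the learned and from a naive initialization.
new_target = 10.5
loss_meta = loss_after_adapt(meta_theta, new_target)
loss_zero = loss_after_adapt(0.0, new_target)
print(f"meta init: {meta_theta:.3f}, loss from meta init: {loss_meta:.4f}, "
      f"loss from zero init: {loss_zero:.4f}")
```

In this toy setting the learned initialization settles among the task targets, so two adaptation steps on an unseen nearby task leave a much smaller loss than starting from zero. It does not model the GAN-based sample synthesis or the feature-space mapping described in the abstract.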
About the journal:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses as well as those demonstrating their application to real-world problems will be welcome.