Reinforcement learning for dynamic resource allocation in optical networks: hype or hope?

IF 4.3 · Zone 2 (Computer Science) · JCR Q1: COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Journal of Optical Communications and Networking Pub Date : 2025-06-27 DOI:10.1364/JOCN.559990

Michael Doherty;Robin Matzner;Rasoul Sadeghi;Polina Bayvel;Alejandra Beghelli

Volume 17, Issue 9, Pages D1-D17 · https://ieeexplore.ieee.org/document/11053539/

Abstract

The application of reinforcement learning (RL) to dynamic resource allocation in optical networks has been the focus of intense research activity in recent years, with almost 100 peer-reviewed papers. We present a review of progress in this field and identify weaknesses in benchmarking practices and reproducibility. To demonstrate best practice, we exactly recreate the problem settings from five landmark papers and apply improved benchmarks. To determine the best benchmarks, we evaluate several heuristic algorithms and optimize the candidate path count and sort criteria for path selection. We apply the improved benchmarks and demonstrate that simple heuristics outperform the published RL solutions, often with an order of magnitude lower blocking probability. Finally, to estimate the limits of improvement on the benchmarks, we present empirical lower bounds on blocking probability using a novel, to our knowledge, defragmentation-based method. Our method estimates that traffic load can be increased by 19%–36% for the same blocking in our examples, which may motivate further research on optimized resource allocation. We make our simulation framework and results openly available to promote reproducible research and standardized evaluation: https://doi.org/10.5281/zenodo.12594495.
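As a rough illustration of the kind of simple heuristic benchmark the abstract refers to, the sketch below implements a k-shortest-paths, first-fit allocation on a toy three-node network and measures the fraction of blocked requests. It is not the authors' simulation framework: the topology, slot count, request list, and all function names are hypothetical, and it assumes incremental traffic (no departures), hop-count path ordering, and a fixed slot width per request.

```python
from typing import Dict, List, Optional, Tuple

Link = Tuple[str, str]


def path_links(path: List[str]) -> List[Link]:
    """Return the undirected links (as sorted node pairs) along a node path."""
    return [tuple(sorted(pair)) for pair in zip(path, path[1:])]


def first_fit(spectrum: Dict[Link, List[bool]], path: List[str], width: int) -> Optional[int]:
    """Lowest start slot where `width` contiguous slots are free on every link."""
    links = path_links(path)
    n_slots = len(spectrum[links[0]])
    for start in range(n_slots - width + 1):
        if all(not spectrum[l][s] for l in links for s in range(start, start + width)):
            return start
    return None


def ksp_ff(spectrum: Dict[Link, List[bool]], candidates: List[List[str]], width: int) -> bool:
    """Try candidate paths in order of hop count; allocate on the first that fits."""
    for path in sorted(candidates, key=len):
        start = first_fit(spectrum, path, width)
        if start is not None:
            for l in path_links(path):
                for s in range(start, start + width):
                    spectrum[l][s] = True  # mark slots as occupied
            return True
    return False  # request is blocked


def blocking_probability(requests, candidate_paths, links, n_slots: int) -> float:
    """Fraction of requests blocked, starting from an empty network."""
    spectrum = {l: [False] * n_slots for l in links}
    blocked = sum(
        not ksp_ff(spectrum, candidate_paths[(src, dst)], width)
        for src, dst, width in requests
    )
    return blocked / len(requests)


# Hypothetical toy setup: triangle A-B-C, 4 slots per link, 2-slot requests.
LINKS = [("A", "B"), ("A", "C"), ("B", "C")]
REQUESTS = [("A", "C", 2)] * 5
K2_PATHS = {("A", "C"): [["A", "C"], ["A", "B", "C"]]}  # two candidate paths
K1_PATHS = {("A", "C"): [["A", "C"]]}                   # direct path only

print(blocking_probability(REQUESTS, K2_PATHS, LINKS, n_slots=4))
print(blocking_probability(REQUESTS, K1_PATHS, LINKS, n_slots=4))
```

In this toy case, allowing a second candidate path lowers the blocked fraction, which hints at why the candidate path count and the path sort criterion are worth tuning before a heuristic is used as a benchmark.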

Source journal

Journal of Optical Communications and Networking (Engineering & Technology: Telecommunications)

CiteScore: 9.40

Self-citation rate: 16.00%

Articles published: 104

Review time: 4 months

About the journal: The scope of the Journal includes advances in the state of the art of optical networking science, technology, and engineering. Both theoretical contributions (including new techniques, concepts, analyses, and economic studies) and practical contributions (including optical networking experiments, prototypes, and new applications) are encouraged. Subareas of interest include the architecture and design of optical networks, optical network survivability and security, software-defined optical networking, elastic optical networks, data and control plane advances, network-management-related innovation, and optical access networks. Enabling technologies and their applications are suitable topics only if the results are shown to directly impact optical networking beyond simple point-to-point networks.