Adaptive traffic engineering with segment routing through deep reinforcement learning

IF 4.6; Zone 2, Computer Science; JCR Q1, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Computer Networks. Pub Date: 2025-05-21. DOI: 10.1016/j.comnet.2025.111356

Ying Tian, Zhiliang Wang, Xia Yin, Xingang Shi, Jiahai Yang, Han Zhang

Volume 267, Article 111356. URL: https://www.sciencedirect.com/science/article/pii/S1389128625003238

Citations: 0

Abstract


Adaptive traffic engineering with segment routing through deep reinforcement learning

Segment Routing (SR) is a source routing technique that has been widely used in Traffic Engineering (TE) because of its scalability and flexibility. Despite extensive research on Traffic Engineering with Segment Routing (SR-TE) in recent years, online SR-TE still faces challenges such as the absence of real-time traffic matrices (TMs), slow online decision speed, and unsatisfactory TE performance. Although TE with Reinforcement Learning (RL) may obviate the need for real-time TMs in online TE, existing studies struggle to handle the vast number of candidate routing plans introduced by SR-TE and incur significant training overhead. In this paper, we propose an online adaptive SR-TE algorithm named Adpt-SRTE. Using deep reinforcement learning (DRL), Adpt-SRTE is first trained on pre-collected historical TMs and then provides SR routing configurations for new TMs online when only real-time link utilization is known. To deal with the massive number of candidate routing plans, Adpt-SRTE combines the Proximal Policy Optimization (PPO) algorithm with an action branching architecture. In addition, appropriate training methods are used to improve TE performance and reduce training overhead. Experimental results demonstrate that Adpt-SRTE achieves good TE performance on both short and long time scales, up to weeks, reducing the maximum link utilization by up to 33%. It also has low offline training overhead, short online decision time, and low path configuration overhead.
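The abstract names two building blocks, PPO and an action branching architecture, without giving their details. The sketch below is only an illustration of why branching helps with a large set of candidate routing plans; it is not the authors' implementation. It assumes, hypothetically, that each flow gets its own branch choosing among a fixed number of candidate routes, that the joint policy factorizes across branches, and that the logits would come from a network fed with link utilization (random numbers stand in for them here). All function and variable names are invented for this example.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_branched_action(branch_logits, rng):
    """Sample one choice per branch; return (action, joint log-probability).

    Assumption: branches are independent given the state, so the joint
    log-probability is the sum of the per-branch log-probabilities.
    """
    action, log_prob = [], 0.0
    for logits in branch_logits:
        probs = softmax(logits)
        choice = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        action.append(choice)
        log_prob += math.log(probs[choice])
    return action, log_prob


def joint_log_prob(branch_logits, action):
    """Log-probability of a given branched action under the current logits."""
    return sum(math.log(softmax(l)[a]) for l, a in zip(branch_logits, action))


def ppo_clipped_objective(new_log_prob, old_log_prob, advantage, clip_eps=0.2):
    """Standard PPO clipped surrogate for a single sample."""
    ratio = math.exp(new_log_prob - old_log_prob)
    clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    return min(ratio * advantage, clipped * advantage)


def max_link_utilization(link_loads, link_capacities):
    """TE metric from the abstract: the highest load/capacity ratio."""
    return max(load / cap for load, cap in zip(link_loads, link_capacities))


# Hypothetical sizes: 4 flows, 5 candidate routes each.
rng = random.Random(0)
num_flows, num_candidates = 4, 5
branch_logits = [
    [rng.uniform(-1.0, 1.0) for _ in range(num_candidates)]
    for _ in range(num_flows)
]
action, logp = sample_branched_action(branch_logits, rng)

# A flat policy would need one output per joint plan (5**4 = 625);
# the branched policy needs only 4 * 5 = 20 logits.
print("joint plans:", num_candidates ** num_flows)
print("branched outputs:", num_flows * num_candidates)
print("sampled action:", action)
```

Under these assumptions the output layer grows with the sum of the branch sizes rather than their product, and PPO's probability ratio can still be computed from the summed log-probabilities. How the paper defines states, branches, and rewards is not stated in the abstract.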


Source journal: Computer Networks (Engineering Technology - Telecommunications)

CiteScore: 10.80
Self-citation rate: 3.60%
Articles published: 434
Review time: 8.6 months

About the journal: Computer Networks is an international, archival journal providing a publication vehicle for complete coverage of all topics of interest to those involved in the computer communications networking area. The audience includes researchers, managers and operators of networks as well as designers and implementors. The Editorial Board will consider any material for publication that is of interest to those groups.