Hierarchical Adaptive Learning-Based Congestion Control With Low Training Overhead for Datacenter Networks
Authors: Jinbin Hu; Zikai Zhou; Jing Wang
Journal: IEEE Transactions on Network and Service Management, vol. 22, no. 5, pp. 4061-4069 (JCR Q1, Computer Science, Information Systems; impact factor 5.4)
Published: 2025-07-16
DOI: 10.1109/TNSM.2025.3589637
URL: https://ieeexplore.ieee.org/document/11082387/
Citation count: 0
Abstract
Most congestion control mechanisms perform well in specific datacenter networks, but none consistently delivers good performance across varying scenarios. Recently proposed frameworks based on reinforcement learning can flexibly select congestion control algorithms to adapt to dynamic network conditions. However, frequently switching congestion control mechanisms while the network is relatively stable leads to instability and unnecessary computational overhead. In this paper, we propose a lightweight and hierarchical adaptive congestion control algorithm (LACC) that is resilient to varying network conditions. To preserve network stability, LACC selects a new congestion control mechanism only when the current algorithm is unsuitable for the current network state, rather than changing the scheme in every training cycle. Simulation results show that LACC reduces the average overhead by 31% and improves throughput by up to 47%, 35%, 23%, and 15% compared to Cubic, Reno, BBR, and Antelope, respectively.
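The gating idea in the abstract (re-select a congestion control algorithm only when the current one no longer suits the network state) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `utility` score, its weights, the threshold, and `toy_policy` (a stand-in for the learned selector) are all hypothetical and chosen only to make the control flow concrete.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class NetworkState:
    throughput_mbps: float
    rtt_ms: float
    loss_rate: float


def utility(state: NetworkState) -> float:
    """Hypothetical suitability score: reward throughput, penalize delay and loss."""
    return state.throughput_mbps - 2.0 * state.rtt_ms - 1000.0 * state.loss_rate


def toy_policy(state: NetworkState) -> str:
    """Placeholder for a learned selection policy (assumption, not from the paper)."""
    return "bbr" if state.loss_rate > 0.01 else "cubic"


class GatedSelector:
    """Two-level selector: a cheap check gates the expensive selection step."""

    def __init__(self, policy: Callable[[NetworkState], str],
                 initial: str, threshold: float) -> None:
        self.policy = policy
        self.current = initial
        self.threshold = threshold
        self.policy_calls = 0  # counts how often the expensive level ran

    def step(self, state: NetworkState) -> str:
        # Level 1: keep the current algorithm while it scores well enough.
        if utility(state) >= self.threshold:
            return self.current
        # Level 2: the current algorithm looks unsuitable, so ask the policy.
        self.policy_calls += 1
        self.current = self.policy(state)
        return self.current


selector = GatedSelector(toy_policy, initial="cubic", threshold=50.0)
states: List[NetworkState] = [
    NetworkState(100.0, 10.0, 0.0),   # stable: keep current algorithm
    NetworkState(95.0, 12.0, 0.0),    # stable: keep current algorithm
    NetworkState(60.0, 20.0, 0.02),   # degraded: triggers re-selection
    NetworkState(90.0, 10.0, 0.0),    # stable again: keep the new choice
]
choices = [selector.step(s) for s in states]
print(choices, selector.policy_calls)
```

In this toy run the policy is consulted once across four cycles instead of four times, which is the kind of saving the abstract attributes to avoiding a switch in every training cycle.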
About the journal:
IEEE Transactions on Network and Service Management will publish (online only) peer-reviewed, archival-quality papers that advance the state of the art and practical applications of network and service management. Theoretical research contributions (presenting new concepts and techniques) and applied contributions (reporting on experiences and experiments with actual systems) will be encouraged. These transactions will focus on the key technical issues related to: Management Models, Architectures and Frameworks; Service Provisioning, Reliability and Quality Assurance; Management Functions; Enabling Technologies; Information and Communication Models; Policies; Applications and Case Studies; Emerging Technologies and Standards.