LRCC: Long-haul RDMA congestion control for cross-datacenter networks

Impact factor 4.6; Zone 2 (Computer Science); JCR Q1 (Computer Science, Hardware & Architecture)
Dingyu Yan, Yaping Liu, Shuo Zhang, Mingguang Xu, Zhikai Yang, Binxing Fang
Journal: Computer Networks, vol. 273, Article 111756
Published: 2025-10-15
DOI: 10.1016/j.comnet.2025.111756
URL: https://www.sciencedirect.com/science/article/pii/S1389128625007224
Open access: no
Citations: 0

Abstract

With the widespread deployment of applications such as cloud storage and distributed model training, Remote Direct Memory Access (RDMA) is increasingly used across datacenters. Cross-datacenter networks typically consist of multiple regional datacenters interconnected by dedicated long-haul optical fiber and Data Center Interconnect (DCI) switches. Existing RDMA congestion control mechanisms face two significant challenges in these networks. First, their long control loops cannot effectively suppress line-rate bursts of cross-domain RDMA traffic, which leads to persistent queues that degrade overall network performance. Second, the heterogeneous Round-Trip Times (RTTs) of cross-domain and intra-datacenter traffic disrupt the convergence and fairness guarantees of conventional methods, further exacerbating cross-domain congestion. In this paper, we propose LRCC, a switch-driven Long-haul RDMA Congestion Control scheme. LRCC uses near-source switches to generate congestion notification packets, which shortens the control loop. In addition, LRCC implements a precise fair-rate computation mechanism on the switches and an adaptive rate-increase strategy on the hosts. Together, these mechanisms mitigate cross-domain congestion caused by hybrid traffic while maintaining high throughput for long-haul flows. We implemented a prototype of LRCC on programmable switches and 400 Gbps FPGA NICs. Testbed experiments show that, compared with the NVIDIA CX7, LRCC reduces tail latency by 11% to 16% in short-distance congestion scenarios and by 45% to 49% in a 640 km long-distance scenario. Large-scale simulations further show that, in cross-datacenter networks, LRCC reduces the average Flow Completion Time (FCT) by up to 67.2%, 94%, and 48.4% compared with DCQCN, HPCC, and BiCC, respectively.
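To give a sense of scale for the "long control loop" problem, the following is a rough back-of-the-envelope sketch rather than anything taken from the paper. It estimates the propagation round-trip time over the 640 km distance used in the testbed and the amount of data a sender could emit at the 400 Gbps line rate during that time. The fiber propagation speed (about 2e8 m/s) and the function names are assumptions made for illustration; queuing, switching, and NIC delays are ignored.

```python
# Back-of-the-envelope control-loop arithmetic for a long-haul link.
# Assumption (not from the paper): light travels through fiber at roughly
# 2e8 m/s, i.e. about 5 microseconds per kilometre.

FIBER_SPEED_M_PER_S = 2.0e8  # assumed propagation speed in optical fiber


def propagation_rtt_s(distance_km: float) -> float:
    """Round-trip propagation delay in seconds (no queuing or switching delay)."""
    one_way_s = distance_km * 1_000 / FIBER_SPEED_M_PER_S
    return 2 * one_way_s


def bytes_in_flight(rate_gbps: float, rtt_s: float) -> float:
    """Bytes emitted at the given rate during one RTT (bandwidth-delay product)."""
    return rate_gbps * 1e9 * rtt_s / 8


if __name__ == "__main__":
    rtt = propagation_rtt_s(640)    # distance from the long-haul testbed scenario
    bdp = bytes_in_flight(400, rtt)  # line rate of the prototype NICs
    print(f"Propagation RTT: {rtt * 1e3:.1f} ms")
    print(f"Bytes in flight at line rate: {bdp / 1e6:.0f} MB")
```

Under these assumptions the propagation RTT is about 6.4 ms and roughly 320 MB can leave the sender before any end-to-end feedback returns, which illustrates why generating congestion notifications at a switch near the source can shorten the reaction time considerably.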
Source journal: Computer Networks (Engineering & Technology: Telecommunications)
CiteScore: 10.80
Self-citation rate: 3.60%
Articles published: 434
Review time: 8.6 months
About the journal: Computer Networks is an international, archival journal providing a publication vehicle for complete coverage of all topics of interest to those involved in the computer communications networking area. The audience includes researchers, managers and operators of networks as well as designers and implementors. The Editorial Board will consider any material for publication that is of interest to those groups.