CCCR: Combining CNP and RTT for congestion control in datacenter networks

Impact Factor 3.5 · Zone 2, Computer Science · JCR Q2, Computer Science, Interdisciplinary Applications
Haopeng Li, Dingyu Yan, Yaping Liu, Shuo Zhang
Simulation Modelling Practice and Theory, vol. 144, Article 103189. Published 2025-07-28.
DOI: 10.1016/j.simpat.2025.103189
https://www.sciencedirect.com/science/article/pii/S1569190X25001248
Citations: 0

Abstract

With the rapid development of cloud computing, AI, and big data, data center networks are expected to deliver ultra-low latency, high bandwidth, and stability at the same time. Many data centers still rely on traditional switches, which lack the programmable features that advanced congestion control algorithms require. In this environment, existing algorithms such as DCQCN and TIMELY face two major challenges: (1) a single congestion signal (such as ECN or RTT) cannot accurately reflect network conditions, so congestion is detected late; (2) heuristic rate control strategies tend to cause network fluctuations and slow convergence, making it difficult to meet the demands of high-bandwidth links. To address these issues, we propose CCCR, a congestion control algorithm that combines ECN (via CNP) and RTT signals. CCCR achieves rapid, accurate rate reduction using receiver-side feedback and employs an adaptive rate increase based on the minimum, average, and target RTT. It also adjusts in-flight data using per-flow BDP estimation. Simulations show that, compared to DCQCN, TIMELY, and Swift, CCCR reduces the average flow completion time in incast scenarios by 11%, 20%, and 12% respectively, with better fairness than HPCC, and reduces tail flow completion time by up to 82% for medium flows and up to 74% for long flows. In large-scale simulations, CCCR achieves performance comparable to HPCC, which relies on programmable switches.
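The abstract names CCCR's ingredients (CNP-triggered rate reduction, an RTT-guided rate increase, and a per-flow BDP cap on in-flight data) but does not give the update rules. The sketch below is therefore not the authors' algorithm. It is a minimal illustration, under assumed constants, of how a sender might combine the two signals. `FlowState`, `on_feedback`, the EWMA weight `alpha`, the halving factor `decrease`, and the additive step `step_bps` are all hypothetical. The only standard relation used is the bandwidth-delay product, BDP = rate × base RTT.

```python
from dataclasses import dataclass


@dataclass
class FlowState:
    rate_bps: float       # current sending rate, bits per second
    line_rate_bps: float  # NIC line rate, upper bound on rate_bps
    min_rtt_s: float      # smallest RTT observed so far, seconds
    avg_rtt_s: float      # smoothed (EWMA) RTT, seconds


def bdp_bytes(rate_bps: float, min_rtt_s: float) -> float:
    """Bandwidth-delay product in bytes for a given rate and base RTT."""
    return rate_bps * min_rtt_s / 8.0


def on_feedback(flow: FlowState, rtt_sample_s: float, cnp_received: bool,
                target_rtt_s: float, alpha: float = 0.125,
                decrease: float = 0.5, step_bps: float = 50e6) -> float:
    """Update the flow on one feedback event and return an in-flight cap in bytes.

    All constants are illustrative assumptions, not values from the paper.
    """
    # Track the minimum and the smoothed average RTT.
    flow.min_rtt_s = min(flow.min_rtt_s, rtt_sample_s)
    flow.avg_rtt_s = (1 - alpha) * flow.avg_rtt_s + alpha * rtt_sample_s

    if cnp_received:
        # ECN-derived signal: cut the rate multiplicatively (assumed rule).
        flow.rate_bps *= decrease
    elif flow.avg_rtt_s < target_rtt_s:
        # RTT-derived signal: increase faster the further the average RTT
        # sits below the target, relative to the min-to-target span.
        span = max(target_rtt_s - flow.min_rtt_s, 1e-9)
        headroom = min((target_rtt_s - flow.avg_rtt_s) / span, 1.0)
        flow.rate_bps = min(flow.rate_bps + step_bps * headroom,
                            flow.line_rate_bps)
    # Otherwise (no CNP, RTT at or above target) hold the current rate.

    # Cap in-flight data at the per-flow BDP estimate.
    return bdp_bytes(flow.rate_bps, flow.min_rtt_s)
```

For scale, a 100 Gbit/s flow with a 10 µs base RTT has a BDP of about 125 KB, which is why the in-flight cap matters on high-bandwidth links: a modest overshoot in rate translates into a large queue.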
Source journal: Simulation Modelling Practice and Theory
Category: Engineering & Technology, Computer Science: Interdisciplinary Applications
CiteScore: 9.80
Self-citation rate: 4.80%
Articles published: 142
Review time: 21 days
Journal description: The journal Simulation Modelling Practice and Theory provides a forum for original, high-quality papers dealing with any aspect of systems simulation and modelling. The journal aims to be a reference and a powerful tool for all those professionally active in or interested in the methods and applications of simulation. Submitted papers are peer reviewed and must contribute significantly to modelling and simulation in general or use modelling and simulation in application areas. Paper submission is solicited on:
• theoretical aspects of modelling and simulation, including formal modelling, model-checking, random number generators, sensitivity analysis, variance reduction techniques, experimental design, meta-modelling, methods and algorithms for validation and verification, selection and comparison procedures, etc.;
• methodology and application of modelling and simulation in any area, including computer systems, networks, real-time and embedded systems, mobile and intelligent agents, manufacturing and transportation systems, management, engineering, biomedical engineering, economics, ecology and environment, education, transaction handling, etc.;
• simulation languages and environments, including those specific to distributed computing, grid computing, high performance computers or computer networks, etc.;
• distributed and real-time simulation, simulation interoperability;
• tools for high performance computing simulation, including dedicated architectures and parallel computing.