HierCC: Hierarchical RDMA Congestion Control
Jiao Zhang, Yali Zhang, Zixuan Guan, Zirui Wan, Yinben Xia, Tian Pan, Tao Huang, Dezhi Tang, Yun Lin
5th Asia-Pacific Workshop on Networking (APNet 2021), published 2021-06-24
DOI: 10.1145/3469393.3469396 (https://doi.org/10.1145/3469393.3469396)
RDMA has been increasingly deployed in data centers to decrease latency and CPU utilization. However, existing RDMA congestion control schemes fail to address the instantaneous large queue build-up or bandwidth under-utilization caused by frequent traffic bursts. In this paper, we argue that traffic uncertainty is the essential reason that data center congestion control cannot simultaneously achieve high throughput and deterministic latency. Since aggregated flows within the same rack are relatively long-lived, we propose HierCC, which aggregates flows in a rack that are destined to the same IP and controls flow rates hierarchically. The rate of aggregate flows between racks is controlled by a credit-based congestion control mechanism. The bandwidth obtained by an aggregate flow is then allocated promptly and accurately to the corresponding individual flows in that rack. We evaluate HierCC using SystemC and large-scale NS3 simulations. Results indicate that, under a realistic workload, HierCC significantly reduces buffer usage and reduces the 99th percentile flow completion time (FCT) by up to 20% compared with HPCC and by up to 40% compared with DCQCN.
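
The two-level structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how an aggregate's bandwidth is divided among its flows, so the max-min fair split below is an assumption, as are the function names, the flow tuple layout, and the rate units. The aggregate rate is taken as an input here; in HierCC it would come from the credit-based mechanism between racks, which this sketch does not model.

```python
from collections import defaultdict


def aggregate_by_destination(flows):
    """Group one rack's flows by destination IP (one aggregate per destination).

    `flows` is a list of (flow_id, dst_ip, demand) tuples; this layout is a
    hypothetical choice made for illustration.
    """
    aggregates = defaultdict(list)
    for flow_id, dst_ip, demand in flows:
        aggregates[dst_ip].append((flow_id, demand))
    return dict(aggregates)


def split_aggregate_rate(members, aggregate_rate):
    """Divide an aggregate's rate among its member flows.

    ASSUMPTION: max-min fairness (water-filling). The abstract only says the
    bandwidth is allocated "promptly and accurately", not which policy is used.
    """
    allocation = {}
    remaining = float(aggregate_rate)
    # Serve the smallest demands first so unused share flows to larger ones.
    pending = sorted(members, key=lambda member: member[1])
    while pending:
        fair_share = remaining / len(pending)
        flow_id, demand = pending.pop(0)
        rate = min(demand, fair_share)
        allocation[flow_id] = rate
        remaining -= rate
    return allocation


# Hypothetical flows in one rack: (flow_id, destination IP, demand).
flows = [
    ("f1", "10.0.1.5", 10.0),
    ("f2", "10.0.1.5", 50.0),
    ("f3", "10.0.2.7", 30.0),
]
aggregates = aggregate_by_destination(flows)
# Suppose the inter-rack level granted this aggregate a rate of 40.
rates = split_aggregate_rate(aggregates["10.0.1.5"], 40.0)
```

With these inputs, `f1` is capped at its own demand and `f2` receives the rest of the aggregate's rate, so the per-flow rates always sum to at most the rate granted at the rack level.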