Anole: Scheduling Flows for Fast Datacenter Networks With Packet Re-Prioritization
Song Zhang; Lide Suo; Wenxin Li; Yuan Liu; Yulong Li; Keqiu Li
IEEE Transactions on Cloud Computing, vol. 12, no. 2, pp. 550-562, published 2024-03-18
DOI: 10.1109/TCC.2024.3376716
URL: https://ieeexplore.ieee.org/document/10472057/
Impact factor: 5.3 (JCR Q1, Computer Science, Information Systems)
Citations: 0
Abstract
Many existing datacenter transports tag packet priorities once at end-hosts and leave them fixed for the duration of each packet's transmission. In this article, we experimentally show that: 1) such fixed packet priorities are not sufficient for minimizing flow completion time (FCT), and 2) adjusting packet transmission priority in the network requires effective coordination among switches. Building on these insights, we present Anole, a new datacenter transport that advocates packet re-prioritization in near-bottleneck switches to minimize FCT. To this end, Anole integrates three simple-yet-effective techniques. First, it employs an in-network telemetry (INT) based approach to dynamically detect the bottleneck for each flow. Second, it adopts an on-off rate control mechanism at each sender that pauses heavily congested flows while continuing to send lightly congested and non-congested ones. Last, it applies an altruistic scheduling policy at each switch so that flows whose next hops are bottleneck switches give way to others. We implement an Anole prototype based on DPDK and show, through both testbed experiments and simulations, that Anole delivers significant performance advantages. For example, compared to EPN, Homa, and Aeolus, it shortens the average FCT of all flows by up to 61.6% and that of small flows by up to 89.1%.
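The three techniques named in the abstract can be illustrated with a toy model. The sketch below is not the authors' algorithm or their DPDK prototype: the telemetry field (`queue_utilization`), the two thresholds, and every function and class name are assumptions made purely for illustration, since the abstract gives no concrete definitions.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical thresholds; the abstract does not state concrete values.
HEAVY_CONGESTION = 0.8  # at or above this, the sender pauses the flow
LIGHT_CONGESTION = 0.2  # below this, a hop is treated as non-congested


@dataclass
class HopRecord:
    """One per-hop telemetry sample returned to the sender (assumed fields)."""
    switch_id: str
    queue_utilization: float  # 0.0 (empty queue) to 1.0 (full queue)


def detect_bottleneck(path: List[HopRecord]) -> Optional[HopRecord]:
    """Return the most congested hop, or None if no hop is congested."""
    if not path:
        return None
    worst = max(path, key=lambda hop: hop.queue_utilization)
    return worst if worst.queue_utilization >= LIGHT_CONGESTION else None


def sender_is_on(path: List[HopRecord]) -> bool:
    """On-off control: pause only when the bottleneck is heavily congested."""
    bottleneck = detect_bottleneck(path)
    return bottleneck is None or bottleneck.queue_utilization < HEAVY_CONGESTION


@dataclass
class QueuedFlow:
    flow_id: str
    base_priority: int         # lower value is served first (end-host tag)
    next_hop: str              # switch this flow is forwarded to next
    bottleneck: Optional[str]  # switch_id of the detected bottleneck, if any


def altruistic_order(flows: List[QueuedFlow]) -> List[QueuedFlow]:
    """Flows whose next hop is their own bottleneck yield to all others.

    Within each group, the end-host priority tag still decides the order.
    """
    return sorted(flows, key=lambda f: (f.next_hop == f.bottleneck, f.base_priority))
```

Under these assumptions, a flow whose path reports a hop at 0.9 utilization is paused, while a flow whose worst hop is at 0.3 keeps sending. At a switch, a flow tagged with the highest end-host priority is still served after the others if its next hop is its bottleneck, which is the sense in which the in-network priority departs from the fixed end-host tag.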
About the journal:
The IEEE Transactions on Cloud Computing (TCC) is dedicated to the multidisciplinary field of cloud computing. It is committed to the publication of articles that present innovative research ideas, application results, and case studies in cloud computing, focusing on key technical issues related to theory, algorithms, systems, applications, and performance.