Alleviating Congestion via Switch Design for Fair Buffer Allocation in Datacenters
Authors: Ahmed M. Abdelmoniem; Brahim Bensaou
Journal: IEEE Transactions on Cloud Computing, vol. 12, no. 1, pp. 219-231
Published: 2024-01-23
DOI: 10.1109/TCC.2024.3357595
URL: https://ieeexplore.ieee.org/document/10412648/
Impact factor: 5.3 (JCR Q1, Computer Science, Information Systems)
Citations: 0
Abstract
In data-centers, the composite origin and bursty nature of traffic, the small bandwidth-delay product, and the tiny switch buffers lead to unusual congestion patterns that are not handled well by traditional end-to-end congestion control mechanisms such as those deployed in TCP. Existing work addresses the problem by modifying TCP to adapt it to the idiosyncrasies of data-centers. While this is feasible in private environments, it remains almost impossible to achieve in practice in public multi-tenant clouds, where a multitude of operating systems, and thus congestion control protocols, co-exist. In this work, we design a simple switch-based active queue management scheme to deal with such congestion issues adequately. Our approach requires no modification to TCP, which enables seamless deployment in public data-centers via switch firmware updates. We present a simple analysis to show the stability and effectiveness of our approach, then discuss real implementations in software and in hardware on the NetFPGA platform. Numerical results from ns-2 simulation and experimental results from a small testbed cluster demonstrate the effectiveness of our approach in achieving high overall throughput, good fairness, shorter flow completion times (FCT) for short-lived flows, and a significant reduction in the tail of the FCT distribution.
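The quantities named in the abstract (bandwidth-delay product, fairness, and the tail of the FCT distribution) can be made concrete with a short sketch. This is an illustration only, not the authors' method or evaluation code: the function names, the 1 Gb/s link, the 100 µs round-trip time, and the sample values are all hypothetical, and Jain's fairness index is assumed here as one common fairness measure, since the abstract does not say which one the paper uses.

```python
def bdp_bytes(link_gbps: float, rtt_us: float) -> float:
    """Bandwidth-delay product in bytes.

    1 Gb/s * 1 us = 1e9 bit/s * 1e-6 s = 1000 bits = 125 bytes.
    """
    return link_gbps * rtt_us * 125.0


def jain_index(throughputs: list[float]) -> float:
    """Jain's fairness index: (sum x)^2 / (n * sum x^2).

    Equals 1.0 when all flows get the same throughput and 1/n when a
    single flow takes everything.
    """
    n = len(throughputs)
    sum_sq = sum(x * x for x in throughputs)
    if n == 0 or sum_sq == 0:
        raise ValueError("need at least one non-zero throughput")
    total = sum(throughputs)
    return (total * total) / (n * sum_sq)


def tail_percentile(values: list[float], p: int) -> float:
    """Nearest-rank percentile for an integer p in 1..100."""
    if not values or not 1 <= p <= 100:
        raise ValueError("need data and 1 <= p <= 100")
    ordered = sorted(values)
    rank = (p * len(ordered) + 99) // 100  # ceil(p * n / 100) in integers
    return ordered[rank - 1]


if __name__ == "__main__":
    # Hypothetical datacenter link: 1 Gb/s with a 100 us round-trip time.
    print(bdp_bytes(1, 100))           # 12500.0 bytes in flight fills the pipe
    print(jain_index([1, 1, 1, 1]))    # 1.0  (perfectly fair)
    print(jain_index([4, 0, 0, 0]))    # 0.25 (one flow starves the rest)
    # Hypothetical FCT samples of 1..100 ms; the 99th percentile is the tail.
    print(tail_percentile(list(range(1, 101)), 99))  # 99
```

With these assumed numbers the pipe holds only about 12.5 KB, a handful of full-size packets, which illustrates why a burst from many senders can overrun a small switch buffer before end-to-end congestion control has time to react.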
About the journal:
The IEEE Transactions on Cloud Computing (TCC) is dedicated to the multidisciplinary field of cloud computing. It is committed to the publication of articles that present innovative research ideas, application results, and case studies in cloud computing, focusing on key technical issues related to theory, algorithms, systems, applications, and performance.