Efficient Dynamic Isolation of Congestion in Lossless DataCenter Networks
Luis Gonzalez-Naharro, J. Escudero-Sahuquillo, P. García, F. Quiles, J. Duato, Wenhao Sun, Ling Shen, Xiang Yu, Hewen Zheng
Proceedings of the ACM SIGCOMM 2019 Workshop on Networking for Emerging Applications and Technologies, published 2019-08-14
DOI: 10.1145/3341558.3342200 (https://doi.org/10.1145/3341558.3342200)
Citations: 2
Abstract
The architecture of modern DataCenters (DCs) has evolved to meet the stringent communication latency requirements of applications. RDMA technologies such as RoCEv2 have become mainstream to reduce latency, but their performance is impaired in systems with lossy networks due to the overload introduced by packet retransmissions. Thus, lossless networks are increasingly used in DCs to avoid retransmission delays. However, lossless networks favor the occurrence of congestion, degrading network and system performance. Traditional congestion solutions, such as backpressure or injection throttling, may be ineffective when congestion arises from traffic generated by DC applications. Hence, new efficient congestion management strategies suited to the lossless networks of modern DCs are required. In this paper, we analyze congestion and its negative effects in these scenarios. In addition, we propose and evaluate a congestion management strategy that effectively eliminates the main negative effects of congestion, based on the dynamic isolation of congested flows in special queues. Unlike previous proposals based on this approach, a single special queue is shared by all the congested flows reaching a port. We also propose enhancements to this basic strategy to optimize its efficiency.
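The isolation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Packet`, `Port`, `mark_congested`, and `can_send` are hypothetical, the way a flow is detected as congested is left to the caller, and the "normal queue first" scheduling order is an arbitrary choice. It only shows how one shared special queue per port keeps packets of congested flows from blocking the head of the normal queue.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass(frozen=True)
class Packet:
    flow: str  # hypothetical flow identifier
    seq: int   # hypothetical sequence number, for illustration only


@dataclass
class Port:
    normal: deque = field(default_factory=deque)
    # Assumption: ONE special queue shared by every congested flow at this port.
    isolated: deque = field(default_factory=deque)
    congested_flows: set = field(default_factory=set)

    def mark_congested(self, flow: str) -> None:
        """Start isolating future packets of this flow (detection not modelled)."""
        self.congested_flows.add(flow)

    def clear_congested(self, flow: str) -> None:
        """Stop isolating future packets of this flow."""
        self.congested_flows.discard(flow)

    def enqueue(self, pkt: Packet) -> None:
        """Store the packet in the shared special queue if its flow is congested."""
        queue = self.isolated if pkt.flow in self.congested_flows else self.normal
        queue.append(pkt)

    def dequeue(self, can_send: Callable[[str], bool]) -> Optional[Packet]:
        """Return the next packet allowed to advance, or None.

        Only the head of each FIFO is eligible; can_send(flow) models whether
        lossless flow control currently lets that flow move forward.
        """
        for queue in (self.normal, self.isolated):
            if queue and can_send(queue[0].flow):
                return queue.popleft()
        return None
```

For example, after `port.mark_congested("hot")`, enqueuing a packet of flow `"hot"` followed by a packet of flow `"cold"` places them in different queues, so a call to `dequeue` while `"hot"` is blocked still returns the `"cold"` packet. With a single FIFO, the blocked `"hot"` packet would sit at the head and stall it.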