Ketong Wu, Dezun Dong, Cunlu Li, Shan Huang, Yi Dai
Network Congestion Avoidance through Packet-chaining Reservation
Proceedings of the 48th International Conference on Parallel Processing, 2019-08-05
DOI: 10.1145/3337821.3337874 (https://doi.org/10.1145/3337821.3337874)
Citations: 13
Abstract
Endpoint congestion is a bottleneck in high-performance computing (HPC) networks and severely degrades system performance, especially for latency-sensitive applications. For long messages (or flows) whose duration far exceeds the round-trip time (RTT), endpoint congestion can be effectively mitigated by proactive or reactive countermeasures that dynamically control the injection rate of each source to a proper level. However, many HPC applications produce hybrid traffic, a mix of short and long messages that is dominated by short messages. Existing proactive congestion avoidance methods struggle to schedule the rapidly changing traffic patterns caused by these short messages. In this paper, we combine the advantages of proactive and reactive congestion avoidance techniques and propose the Packet-chaining Reservation Protocol (PCRP), which dynamically balances flows that follow proactive scheduling against packets that are subject to reactive network conditions. We use chains of packets as a flexible reservation granularity between a whole flow and a single packet. We allow small flows to be transmitted speculatively without being discarded and give them higher priority across the entire network. PCRP responds quickly to network conditions, effectively avoids the formation of endpoint congestion, and reduces the average flow delay. We conduct extensive experiments to evaluate PCRP and compare it with the state-of-the-art proactive reservation-based protocols, Speculative Reservation Protocol (SRP) and Bilateral Flow Reservation Protocol (BFRP). The simulation results demonstrate that our design can reduce flow latency by 50.2% for hotspot traffic and 28.38% for uniform traffic.
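
To make the reservation granularity concrete, the following is a minimal sketch of how a source might split its traffic under a packet-chaining scheme. It is not the authors' implementation: the threshold `SMALL_FLOW_PACKETS`, the chain size `CHAIN_LENGTH`, the priority encoding, and the names `Transmission` and `plan_flow` are all hypothetical, since the abstract gives no concrete values or interfaces. The sketch only illustrates the two ideas stated above: small flows are sent speculatively at higher priority, and larger flows are reserved one chain of packets at a time rather than as a whole flow or packet by packet.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical parameters: the abstract does not specify these values.
SMALL_FLOW_PACKETS = 4   # flows at or below this size are sent speculatively
CHAIN_LENGTH = 8         # number of packets covered by one reservation


@dataclass
class Transmission:
    kind: str            # "speculative" or "reserved"
    priority: int        # larger number = higher priority (assumed convention)
    packets: List[int]   # packet sequence numbers carried by this unit


def plan_flow(num_packets: int) -> List[Transmission]:
    """Split one flow into transmission units (illustrative only)."""
    if num_packets <= 0:
        return []
    if num_packets <= SMALL_FLOW_PACKETS:
        # Small flow: no reservation, sent speculatively at high priority.
        return [Transmission("speculative", 1, list(range(num_packets)))]
    # Larger flow: one reservation per chain of up to CHAIN_LENGTH packets.
    return [
        Transmission(
            "reserved",
            0,
            list(range(start, min(start + CHAIN_LENGTH, num_packets))),
        )
        for start in range(0, num_packets, CHAIN_LENGTH)
    ]
```

With these assumed values, `plan_flow(3)` yields a single speculative unit, while `plan_flow(20)` yields three reserved chains of 8, 8, and 4 packets. Choosing the chain length trades reservation overhead against how quickly the schedule can adapt to changing traffic.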