Q-adaptive: A Multi-Agent Reinforcement Learning Based Routing on Dragonfly Network

Yao Kang, Xin Wang, Z. Lan
{"title":"Q-adaptive: A Multi-Agent Reinforcement Learning Based Routing on Dragonfly Network","authors":"Yao Kang, Xin Wang, Z. Lan","doi":"10.1145/3431379.3460650","DOIUrl":null,"url":null,"abstract":"High-radix interconnects such as Dragonfly and its variants rely on adaptive routing to balance network traffic for optimum performance. Ideally, adaptive routing attempts to forward packets between minimal and non-minimal paths with the least congestion. In practice, current adaptive routing algorithms estimate routing path congestion based on local information such as output queue occupancy. Using local information to estimate global path congestion is inevitably inaccurate because a router has no precise knowledge of link states a few hops away. This inaccuracy could lead to interconnect congestion. In this study, we present Q-adaptive routing, a multi-agent reinforcement learning routing scheme for Dragonfly systems. Q-adaptive routing enables routers to learn to route autonomously by leveraging advanced reinforcement learning technology. The proposed Q-adaptive routing is highly scalable thanks to its fully distributed nature without using any shared information between routers. Furthermore, a new two-level Q-table is designed for Q-adaptive to make it computational lightly and saves 50% of router memory usage compared with the previous Q-routing. We implement the proposed Q-adaptive routing in SST/Merlin simulator. Our evaluation results show that Q-adaptive routing achieves up to 10.5% system throughput improvement and 5.2x average packet latency reduction compared with adaptive routing algorithms. Remarkably, Q-adaptive can even outperform the optimal VALn non-minimal routing under the ADV+1 adversarial traffic pattern with up to 3% system throughput improvement and 75% average packet latency reduction.","PeriodicalId":343991,"journal":{"name":"Proceedings of the 30th International Symposium on High-Performance Parallel and Distributed Computing","volume":"54 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 30th International Symposium on High-Performance Parallel and Distributed Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3431379.3460650","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

High-radix interconnects such as Dragonfly and its variants rely on adaptive routing to balance network traffic for optimum performance. Ideally, adaptive routing forwards each packet along whichever of the minimal and non-minimal paths is least congested. In practice, current adaptive routing algorithms estimate path congestion from local information such as output queue occupancy. Using local information to estimate global path congestion is inevitably inaccurate, because a router has no precise knowledge of link states a few hops away, and this inaccuracy can lead to interconnect congestion. In this study, we present Q-adaptive routing, a multi-agent reinforcement learning routing scheme for Dragonfly systems. Q-adaptive routing enables routers to learn to route autonomously by leveraging reinforcement learning. The proposed Q-adaptive routing is highly scalable thanks to its fully distributed nature: routers share no information with one another. Furthermore, a new two-level Q-table is designed for Q-adaptive that makes it computationally lightweight and saves 50% of router memory usage compared with the earlier Q-routing. We implement the proposed Q-adaptive routing in the SST/Merlin simulator. Our evaluation results show that Q-adaptive routing achieves up to 10.5% system throughput improvement and a 5.2x average packet latency reduction compared with existing adaptive routing algorithms. Remarkably, Q-adaptive can even outperform the optimal VALn non-minimal routing under the ADV+1 adversarial traffic pattern, with up to 3% system throughput improvement and 75% average packet latency reduction.
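For context, the scheme builds on per-router Q-learning in the tradition of Q-routing, in which each router keeps an estimate of the remaining delivery time to each destination via each neighbor and refines it from feedback returned when a packet is handed off. Below is a minimal sketch of that classic single-table Q-routing update (Boyan and Littman, 1994), included only as a point of reference; the class and parameter names are illustrative, and this is not the paper's two-level Q-table design.

```python
import random
from collections import defaultdict

class QRouter:
    """Sketch of classic single-table Q-routing (Boyan & Littman, 1994),
    the baseline that Q-adaptive's two-level Q-table refines.
    All names and default values here are illustrative."""

    def __init__(self, neighbors, alpha=0.1, epsilon=0.05):
        self.neighbors = neighbors  # IDs of adjacent routers
        self.alpha = alpha          # learning rate
        self.epsilon = epsilon      # exploration probability
        # q[(dest, next_hop)]: estimated remaining delivery time
        # for a packet to dest if forwarded via next_hop
        self.q = defaultdict(float)

    def choose_next_hop(self, dest):
        # Epsilon-greedy: usually pick the neighbor with the lowest
        # estimated delivery time, occasionally explore at random.
        if random.random() < self.epsilon:
            return random.choice(self.neighbors)
        return min(self.neighbors, key=lambda n: self.q[(dest, n)])

    def update(self, dest, next_hop, queue_delay, link_delay, neighbor_best):
        # neighbor_best is the neighbor's own minimum Q-estimate for dest,
        # reported back to this router at packet hand-off.
        target = queue_delay + link_delay + neighbor_best
        old = self.q[(dest, next_hop)]
        self.q[(dest, next_hop)] = old + self.alpha * (target - old)
```

Q-adaptive replaces the flat (destination, next-hop) table above with a two-level Q-table, which the abstract reports halves router memory usage relative to the earlier Q-routing while keeping the computation lightweight.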