A novel 3D layer-multiplexed on-chip network
R. Ramanujam, Bill Lin
Symposium on Architectures for Networking and Communications Systems, 2009-10-19
DOI: 10.1145/1882486.1882517 (https://doi.org/10.1145/1882486.1882517)
Citations: 2
Abstract
Recently, a near-optimal oblivious routing algorithm for 3D mesh networks called Randomized Partially-Minimal (RPM) routing was proposed [12]. It works by load-balancing traffic across vertical layers and routing minimally on each horizontal layer. RPM achieves optimal worst-case throughput when the network radix k is even and comes within a factor of 1/k^2 of optimal when k is odd, and it achieves significantly lower latency than Valiant routing [18], the best previously known algorithm with optimal worst-case throughput. This paper presents a novel layer-multiplexed (LM) architecture for 3D on-chip networks that exploits the optimality of RPM together with the short inter-layer wiring delays enabled by 3D technology. The LM architecture replaces the one-layer-per-hop routing in a 3D mesh with simpler vertical demultiplexing and multiplexing structures. By adapting RPM routing, the proposed LM architecture achieves the same worst-case throughput as a 3D mesh. On a symmetric 4 x 4 x 4 mesh topology, however, it consumes 27% less power, occupies 27% less area, attains 14.5% higher average throughput, and achieves a 33% lower worst-case hop count. On an asymmetric 8 x 8 x 4 mesh, the LM architecture achieves average-case throughput comparable to a 3D mesh, but consumes 26% less power, takes up 27% less area, and attains a 20% lower worst-case hop count.
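The routing idea in the abstract can be illustrated with a small sketch. The code below picks an RPM-style path on a 3D mesh: travel vertically to a uniformly random intermediate layer, route minimally on that layer, then travel vertically to the destination layer. The function names, the fixed X-then-Y order on the horizontal layer, and the assumption that each vertical demultiplexing or multiplexing step in the LM architecture counts as a single hop are all illustrative assumptions and are not taken from the paper. Under those assumptions, the worst-case hop counts happen to reproduce the 33% and 20% reductions quoted in the abstract.

```python
import random


def rpm_path_mesh(src, dst, layers, rng=random):
    """Sketch of an RPM-style route on a 3D mesh (hypothetical helper).

    src, dst: (x, y, z) node coordinates; layers: number of vertical layers.
    Returns the list of nodes visited, including src and dst.
    """
    x, y, z = src
    dst_x, dst_y, dst_z = dst
    mid = rng.randrange(layers)  # uniformly random intermediate layer
    path = [(x, y, z)]

    def toward(cur, target):
        # Move one step along a dimension in the direction of the target.
        return cur + (1 if target > cur else -1)

    # Phase 1: vertical hops, one layer per hop, to the intermediate layer.
    while z != mid:
        z = toward(z, mid)
        path.append((x, y, z))
    # Phase 2: minimal routing on the intermediate layer (X then Y, assumed).
    while x != dst_x:
        x = toward(x, dst_x)
        path.append((x, y, z))
    while y != dst_y:
        y = toward(y, dst_y)
        path.append((x, y, z))
    # Phase 3: vertical hops to the destination layer.
    while z != dst_z:
        z = toward(z, dst_z)
        path.append((x, y, z))
    return path


def worst_case_hops_mesh(kx, ky, kz):
    # Corner-to-corner horizontally, with source and destination on one
    # outer layer and the intermediate layer on the opposite outer layer.
    return (kx - 1) + (ky - 1) + 2 * (kz - 1)


def worst_case_hops_lm(kx, ky, kz):
    # Assumption: one hop to demultiplex onto a layer, one hop to multiplex
    # back, regardless of how many layers are crossed.
    return (kx - 1) + (ky - 1) + 2


def hop_reduction(kx, ky, kz):
    # Fractional reduction in worst-case hop count of LM relative to the mesh.
    return 1 - worst_case_hops_lm(kx, ky, kz) / worst_case_hops_mesh(kx, ky, kz)
```

With these definitions, `hop_reduction(4, 4, 4)` compares 8 hops against 12 (about 33% fewer) and `hop_reduction(8, 8, 4)` compares 16 hops against 20 (20% fewer). This is a consistency check on one plausible hop-counting model, not a reconstruction of the paper's evaluation.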