Simple virtual channel allocation for high throughput and high frequency on-chip routers

HPCA - 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture. Pub Date: 2010-04-01. DOI: 10.1145/2742349

Yi Xu, Bo Zhao, Youtao Zhang, Jun Yang

Citations: 51

Abstract

Technology scaling has led to the integration of many cores into a single chip. As a result, on-chip interconnection networks play an increasingly important role in determining the performance and power of the entire chip. Packet-switched network-on-chip (NoC) provides a scalable solution for communication in tiled multi-core processors. However, the virtual-channel (VC) buffers in the NoC consume a significant share of the system's dynamic and leakage power. To improve the energy efficiency of the router design, it is advantageous to use small buffers while still maintaining network throughput. This paper proposes two new virtual channel allocation (VA) mechanisms, termed Fixed VC Assignment with Dynamic VC Allocation (FVADA) and Adjustable VC Assignment with Dynamic VC Allocation (AVADA). The idea is that VCs are assigned based on the designated output port of a packet, which reduces Head-of-Line (HoL) blocking. In addition, the number of VCs allocated to each output port can be adjusted dynamically. Unlike previous buffer-pool based designs, we use only a small number of VCs to keep the arbitration latency low. Simulation results show that FVADA and AVADA improve network throughput by 41% on average compared to a baseline design with the same buffer size. AVADA still outperforms the baseline even when our buffer size is halved. Moreover, we achieve comparable or better throughput than a previous dynamic VC allocator while reducing its critical path delay by 60%. Our results show that the proposed VA mechanisms are suitable for low-power, high-throughput, and high-frequency on-chip network designs.
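
The abstract does not spell out how the allocators work internally, so the following Python fragment is only a toy sketch of the general idea it describes: a packet is given a VC from the set tied to its output port, and in an adjustable mode an idle VC may be moved from one output port's set to another. The class name `ToyVCAllocator`, the port names, the borrowing rule, and the "keep at least one VC per port" policy are all assumptions made for illustration; they are not taken from the paper and should not be read as the FVADA or AVADA design.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class ToyVCAllocator:
    """Toy model of output-port-based VC assignment (illustrative assumption only)."""

    # Hypothetical mapping: output port name -> VC ids currently assigned to it.
    assignment: Dict[str, List[int]]
    # False: the mapping never changes; True: idle VCs may move between ports.
    adjustable: bool = False
    busy: Set[int] = field(default_factory=set)

    def allocate(self, out_port: str) -> Optional[int]:
        """Return a VC id for a packet heading to out_port, or None if none is free."""
        own = self.assignment.setdefault(out_port, [])
        # First try a free VC already assigned to this packet's output port.
        for vc in own:
            if vc not in self.busy:
                self.busy.add(vc)
                return vc
        if self.adjustable:
            # Assumed policy: borrow an idle VC from another port,
            # but never take that port's last remaining VC.
            for other, vcs in self.assignment.items():
                if other == out_port or len(vcs) <= 1:
                    continue
                for vc in vcs:
                    if vc not in self.busy:
                        vcs.remove(vc)
                        own.append(vc)
                        self.busy.add(vc)
                        return vc
        return None  # no VC available; the packet has to wait

    def release(self, vc: int) -> None:
        """Mark a VC as free again once its packet has left the buffer."""
        self.busy.discard(vc)


# Usage: four VCs split between two hypothetical output ports.
fixed = ToyVCAllocator({"east": [0, 1], "west": [2, 3]}, adjustable=False)
fixed_result = [fixed.allocate("east") for _ in range(3)]

adaptive = ToyVCAllocator({"east": [0, 1], "west": [2, 3]}, adjustable=True)
adaptive_result = [adaptive.allocate("east") for _ in range(4)]
```

In this toy model, packets bound for different output ports never queue behind each other in the same VC, which is the Head-of-Line blocking effect the abstract targets. The adjustable mode lets a heavily used output port take over an idle VC instead of stalling.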
