FastTrack: Exploiting Fast FPGA Wiring for Implementing NoC Shortcuts (Abstract Only)

Nachiket Kapre, T. Krishna
Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, published 2018-02-15
DOI: https://doi.org/10.1145/3174243.3174962

Abstract

The latency of packet-switched FPGA overlay Networks-on-Chip (NoCs) grows linearly with the NoC dimensions, since packets typically spend a cycle in each dynamic router along the path. High-performance FPGA NoCs must aggressively pipeline their interconnect, which adds further latency. FPGA-friendly deflection routing schemes exacerbate latency further. Fortunately, FPGAs provide segmented interconnect with tracks of different lengths (and speeds), and the faster tracks can be used to reduce the number of switchbox hops along the packet path. We introduce FastTrack, an adaptation of the NoC organization that inserts express bypass links to skip multiple router stages in a single clock cycle. The FastTrack design can be tuned with different express link lengths for performance and with depopulation strategies for controlling cost. On the Xilinx Virtex-7 485T FPGA, an 8×8 FastTrack NoC is 2× larger than a base Hoplite NoC but operates at between 1.2× and 0.8× of its clock frequency when using express links of length 2 to 4. FastTrack delivers throughput and latency improvements across a range of statistical workloads (2-2.5×) and on traces extracted from FPGA accelerator case studies such as Sparse Matrix-Vector Multiplication (2.5×), Graph Analytics (2.8×), and Multi-processor overlay applications (2×). FastTrack also improves energy efficiency by up to 2× over baseline Hoplite, owing to higher sustained rates and the high-speed operation of express links made possible by fast FPGA interconnect.
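
To illustrate why express links shorten the packet path, the following is a minimal zero-load sketch and not the authors' model. It assumes a unidirectional 2D torus with dimension-ordered routing, an express link of length `express_len` available at every router (no depopulation), and no deflections; none of these details are stated in the abstract, and the function names are hypothetical. It counts router traversals only and ignores the clock frequency change that express links introduce.

```python
def ring_hops(distance: int, express_len: int) -> int:
    """Router traversals needed to cover `distance` positions along one
    dimension, taking express links greedily and then single-step links.

    Assumption: every router has an express link of length `express_len`;
    express_len == 1 models a NoC with no express links.
    """
    if distance < 0 or express_len < 1:
        raise ValueError("distance must be >= 0 and express_len >= 1")
    express_hops, local_hops = divmod(distance, express_len)
    return express_hops + local_hops


def mean_hops(n: int, express_len: int) -> float:
    """Average zero-load hop count over all (dx, dy) offsets on an
    n x n unidirectional torus, routing X first and then Y.

    Assumption: uniform traffic, including the zero-offset case.
    """
    total = 0
    for dx in range(n):
        for dy in range(n):
            total += ring_hops(dx, express_len) + ring_hops(dy, express_len)
    return total / (n * n)


if __name__ == "__main__":
    for length in (1, 2, 4):
        print(f"express_len={length}: mean hops = {mean_hops(8, length)}")
```

Under these assumptions, an 8×8 NoC averages 7.0 hops without express links and 4.0 hops with express links of length 2 or 4. The measured gains reported in the abstract also depend on deflections, depopulation, and achieved clock frequency, which this sketch does not capture.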