TputCache: High-frequency, multi-way cache for high-throughput FPGA applications
Aaron Severance, G. Lemieux
2013 23rd International Conference on Field Programmable Logic and Applications (FPL)
DOI: 10.1109/FPL.2013.6645537
Published: 2013-10-24
Citations: 7
Abstract
Throughput processing involves using many different contexts or threads to solve multiple problems or subproblems in parallel, where the size of the problem is large enough that latency can be tolerated. However, bandwidth is required to support multiple concurrent executions, and utilizing multiple external memory channels is costly. For small working sets, FPGA designers can use on-chip BRAMs to achieve the necessary bandwidth without increasing the system cost. Designing algorithms around fixed-size local memories is difficult, however, as there is no graceful fallback if the problem size exceeds the amount of local memory. This paper introduces TputCache, a cache designed to meet the needs of throughput processing on FPGAs, giving the throughput performance of on-chip BRAMs when the problem size fits in local memory. The design utilizes a replay-based architecture to achieve high frequency with very low resource overheads.