An efficient FPGA-based memory architecture for compute-intensive applications on embedded devices

2017 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM). Pub Date: 2017-08-01. DOI: 10.1109/PACRIM.2017.8121901

S. N. Shahrouzi, D. Perera


Citation count: 3

Abstract

FPGAs are increasingly used to accelerate real-time compute- and data-intensive applications on embedded platforms. They achieve high performance by exploiting several forms of parallelism in computations. However, the on-chip memories of current FPGAs are typically dual-port, which limits the multiple simultaneous read/write (R/W) operations that parallel processing requires. Although several multi-ported memories have been proposed in the literature to address this issue, the existing architectures involve a tradeoff: increasing the number of ports reduces the total on-chip memory available in the block RAMs for storing the data essential to real-time processing. This tradeoff is undesirable, especially for real-time compute/data-intensive applications on embedded platforms, because of the significant time spent accessing external memory. In this research work, we introduce a novel and efficient multi-ported memory architecture that addresses this tradeoff. Experiments are performed to evaluate the feasibility and efficiency of the architecture. Our memory architecture is generic and parameterized: it can be configured to provide a sufficient number of ports for simultaneous R/W operations while utilizing the total available on-chip memory to store the essential data.
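The port-versus-capacity tradeoff described in the abstract can be illustrated with a toy model. The sketch below is not the authors' architecture. It assumes one common way of building multi-ported memories from dual-port block RAMs, namely replication, in which every read port gets its own full copy of the data and all copies share a single write port. The class name, the function name and the 4096-bit budget are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicatedMemory:
    """Toy model of a replication-based multi-read-port memory.

    Assumption (not taken from the paper): each read port is served by
    its own dual-port block RAM bank holding a full copy of the data,
    and one port of every bank is tied to the shared write port.
    """

    total_bram_bits: int  # hypothetical on-chip block RAM budget
    read_ports: int       # number of simultaneous read ports wanted

    def usable_bits(self) -> int:
        # All replicas store identical data, so the unique capacity is
        # the budget divided by the number of replicas (integer bits).
        return self.total_bram_bits // self.read_ports


def capacity_table(total_bram_bits: int, max_read_ports: int):
    """Return (read_ports, usable_bits) pairs for 1..max_read_ports."""
    return [
        (ports, ReplicatedMemory(total_bram_bits, ports).usable_bits())
        for ports in range(1, max_read_ports + 1)
    ]


if __name__ == "__main__":
    for ports, bits in capacity_table(4096, 4):
        print(f"{ports} read port(s): {bits} usable bits")
```

Under this assumption, usable capacity falls roughly in inverse proportion to the number of read ports, so more data must be fetched from external memory. That is the cost the abstract says the proposed architecture is designed to avoid.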

