An efficient FPGA-based memory architecture for compute-intensive applications on embedded devices
Authors: S. N. Shahrouzi, D. Perera
Venue: 2017 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM)
Published: 2017-08-01
DOI: 10.1109/PACRIM.2017.8121901 (https://doi.org/10.1109/PACRIM.2017.8121901)
FPGAs are increasingly used to accelerate real-time, compute- and data-intensive applications on embedded platforms. They achieve high performance by exploiting several forms of parallelism in the computation. However, the on-chip memories of current FPGAs are typically dual-port, which limits the number of simultaneous read/write (R/W) operations that parallel processing requires. Several multi-ported memories have been proposed in the literature to address this issue, but the existing architectures share a tradeoff: increasing the number of ports reduces the total on-chip block RAM capacity available for storing the data essential to real-time processing. This tradeoff is undesirable, especially for real-time compute- and data-intensive applications on embedded platforms, because data that no longer fits on chip must be fetched from external memory, and those accesses take a significant amount of time. In this work, we introduce a novel and efficient multi-ported memory architecture that addresses this tradeoff. We perform experiments to evaluate its feasibility and efficiency. The architecture is generic and parameterized: it can be configured to provide a sufficient number of ports for simultaneous R/W operations while still using the total available on-chip memory to store the essential data.
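The port-versus-capacity tradeoff described in the abstract can be illustrated with a rough calculation. The sketch below is not the authors' architecture. It assumes a simplified replication-based design in which every stored word is copied once per write-port/read-port pair, so usable capacity shrinks as ports are added. The function name `effective_capacity_kbits` and the 4096 Kbit block RAM budget are hypothetical, chosen only for illustration.

```python
def effective_capacity_kbits(total_bram_kbits: float,
                             read_ports: int,
                             write_ports: int = 1) -> float:
    """Usable capacity under a simplified replication model (an assumption).

    Each stored word is assumed to be duplicated once per
    (write port, read port) pair, so the usable capacity is the
    physical block RAM budget divided by the number of replicas.
    """
    if read_ports < 1 or write_ports < 1:
        raise ValueError("port counts must be at least 1")
    replicas = read_ports * write_ports
    return total_bram_kbits / replicas


if __name__ == "__main__":
    total = 4096.0  # hypothetical on-chip block RAM budget, in Kbits
    for reads in (1, 2, 4, 8):
        usable = effective_capacity_kbits(total, reads)
        print(f"1W/{reads}R: {usable:.0f} Kbits usable of {total:.0f}")
```

Under this assumed model, a 1-write/4-read memory leaves a quarter of the physical block RAM for unique data. That loss is the cost the abstract says its architecture is designed to avoid.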