2009 International Conference on Reconfigurable Computing and FPGAs: Latest Publications

Matrix Multiplication Based on Scalable Macro-Pipelined FPGA Accelerator Architecture

2009 International Conference on Reconfigurable Computing and FPGAs Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.30

Jiang Jiang, Vincent Mirian, Kam Pui Tang, P. Chow, Zuocheng Xing

Citations: 20

A Scalable Architecture for Multivariate Polynomial Evaluation on FPGA

2009 International Conference on Reconfigurable Computing and FPGAs Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.22

Mathieu Allard, P. Grogan, J. David

Citations: 2

Runtime Memory Allocation in a Heterogeneous Reconfigurable Platform

2009 International Conference on Reconfigurable Computing and FPGAs Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.38

V. Sima, K. Bertels

Abstract: In this paper, we present a runtime memory allocation algorithm that aims to substantially reduce the overhead caused by shared-memory accesses by allocating memory directly in the local scratch pad memories. We target a heterogeneous platform with a complex memory hierarchy. Using special instrumentation, we determine which memory areas are used in functions that could run on different processing elements, such as a reconfigurable logic array. Based on profile information, the programmer annotates some functions as candidates for accelerated execution. Then, an algorithm decides the best allocation, taking into account the various processing elements and special scratch pad memories of the heterogeneous platform. Tests are performed on our prototype platform, a Virtex ML410 with the Linux operating system, containing a PowerPC processor and a Xilinx FPGA, implementing the MOLEN programming paradigm. We test the algorithm using both a state-of-the-art H.264 video encoder and synthetic applications. The performance improvement for the H.264 application is 14% compared to the software-only version, while the overhead is less than 1% of the application execution time. This is the optimal improvement that can be obtained by optimizing the memory allocation. For the synthetic applications, the results are within 5% of the optimum.

Citations: 6

A Framework for 2.5D NoC Exploration Using Homogeneous Networks over Heterogeneous Floorplans

2009 International Conference on Reconfigurable Computing and FPGAs Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.14

V. D. Paulo, Cristinel Ababei

Citations: 8

Protecting the NOEKEON Cipher against SCARE Attacks in FPGAs by Using Dynamic Implementations

2009 International Conference on Reconfigurable Computing and FPGAs Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.19

J. Bringer, H. Chabanne, J. Danger

Citations: 3

A Traversal Cache Framework for FPGA Acceleration of Pointer Data Structures: A Case Study on Barnes-Hut N-body Simulation

2009 International Conference on Reconfigurable Computing and FPGAs Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.68

J. Coole, J. Wernsing, G. Stitt

Abstract: Numerous studies have shown that field-programmable gate arrays (FPGAs) often achieve large speedups compared to microprocessors. However, one significant limitation of FPGAs that has prevented their use on important applications is the requirement for regular memory access patterns. Traversal caches were previously introduced to improve the performance of FPGA implementations of algorithms with irregular memory access patterns, especially those traversing pointer-based data structures. However, a significant limitation of previous traversal caches is that speedup was limited to traversals repeated frequently over time, thus preventing speedup for algorithms without repetition, even if the similarity between traversals was large. This paper presents a new framework that extends traversal caches to enable performance improvements in such cases and provides additional improvements through reduced memory accesses and parallel processing of multiple traversals. Most importantly, we show that, for algorithms with highly similar traversals, the traversal cache framework achieves approximately linear kernel speedup with additional area, thus eliminating the memory bandwidth bottleneck commonly associated with FPGAs. We evaluate the framework using a Barnes-Hut n-body simulation case study, showing application speedups ranging from 12x to 13.5x on a Virtex4 LX100 with projected speedups as high as 40x on today’s largest FPGAs.

Citations: 14

Efficient Technique for the FPGA Implementation of the AES MixColumns Transformation

2009 International Conference on Reconfigurable Computing and FPGAs Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.52

S. Ghaznavi, C. Gebotys, R. Elbaz

Citations: 15

A Reconfigurable Architecture for Stereo-Assisted Detection of Point-Features for Robot Mapping

2009 International Conference on Reconfigurable Computing and FPGAs Pub Date : 2009-12-09 DOI: 10.1109/RECONFIG.2009.41

J. Kalomiros, J. Lygouras

Citations: 4

FPGA Implementations of BCD Multipliers

2009 International Conference on Reconfigurable Computing and FPGAs Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.28

G. Sutter, E. Todorovich, G. Bioul, M. Vazquez, J. Deschamps

Citations: 33

Reconfigurable Hardware Implementation of Arithmetic Modulo Minimal Redundancy Cyclotomic Primes for ECC

2009 International Conference on Reconfigurable Computing and FPGAs Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.67

Brian Baldwin, W. Marnane, R. Granger

Citations: 5