2013 23rd International Conference on Field programmable Logic and Applications: Latest Papers, Page 5

Rapid FPGA design prototyping through preservation of system logic: A case study

2013 23rd International Conference on Field programmable Logic and Applications Pub Date : 2013-10-24 DOI: 10.1109/FPL.2013.6645539

Travis Haroldsen, B. Nelson, Brad White

Citations: 5

Hardware-accelerated regular expression matching for high-throughput text analytics

2013 23rd International Conference on Field programmable Logic and Applications Pub Date : 2013-10-24 DOI: 10.1109/FPL.2013.6645534

K. Atasu, R. Polig, C. Hagleitner, Frederick Reiss

Citations: 22

An automatic FPGA design and implementation framework

2013 23rd International Conference on Field programmable Logic and Applications Pub Date : 2013-10-24 DOI: 10.1109/FPL.2013.6645593

Qian Zhao, M. Amagasaki, M. Iida, M. Kuga, T. Sueyoshi

Citations: 6

A run-time graph-based Polynomial Placement and routing algorithm for virtual FPGAs

2013 23rd International Conference on Field programmable Logic and Applications Pub Date : 2013-10-24 DOI: 10.1109/FPL.2013.6645514

R. Ferreira, L. Rocha, A. G. Santos, J. A. Nacif, Stephan Wong, L. Carro

Abstract: Dynamic partial reconfiguration enables efficient use of hardware resources by multiplexing system functionality in time. However, many challenges arise from partial reconfiguration implementation. The placement and routing (P&R) of the hardware modules is a computationally intensive task, and the state-of-the-art algorithms are not suitable for placing and routing modules at run-time. This paper makes several contributions: (1) Single Placement at run-time: we propose a novel P&R algorithm based on a greedy heuristic where a single placement is performed at run-time in a few milliseconds. (2) Implicit Graph Model: the FPGA is modelled as an implicit graph with a direct correspondence to the physical FPGA, and the P&R is performed as a graph mapping problem by exploring the node locality during the depth-first traversal. (3) Polynomial Placement: we show that even a single placement can be routed without critical path degradation. (4) Fragmented Regions: the graph approach is flexible, and it allows efficient placement even onto fragmented FPGA areas. Compared with the most popular P&R tool running the same benchmark suite, our algorithm is on average 864x faster. Moreover, the bitstream for partial reconfiguration is also reduced by a factor of 4.

Citations: 7
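To make the idea of greedy, single-pass placement by depth-first traversal more concrete, here is a minimal Python sketch. It is not the authors' algorithm: the function name `greedy_place`, the Manhattan-distance tie-breaking, and the modelling of the FPGA as a plain set of free grid cells are all assumptions made for illustration, and routing is left out entirely.

```python
def greedy_place(graph, root, free_cells):
    """Place each node of `graph` once, close to the node that reached it.

    graph      -- dict: node -> list of successor nodes (hypothetical netlist model)
    root       -- node where the depth-first traversal starts
    free_cells -- iterable of (x, y) cells still available; may be fragmented
    Returns a dict: node -> (x, y). Nodes unreachable from root are not placed.
    """
    free = set(free_cells)
    if not free:
        raise ValueError("no free cells available")
    placement = {}

    def nearest_free(cell):
        # Greedy choice: the free cell with the smallest Manhattan distance.
        # The cell coordinates break ties so the result is deterministic.
        cx, cy = cell
        return min(free, key=lambda c: (abs(c[0] - cx) + abs(c[1] - cy), c))

    # Each stack entry is (node, cell of the node that reached it).
    stack = [(root, min(free))]
    while stack:
        node, near = stack.pop()
        if node in placement:
            continue  # already placed via another path; never moved again
        if not free:
            raise ValueError("not enough free cells for the graph")
        cell = nearest_free(near)
        free.remove(cell)
        placement[node] = cell
        # Push successors in reverse so the first successor is visited first.
        for succ in reversed(graph.get(node, [])):
            stack.append((succ, cell))
    return placement


if __name__ == "__main__":
    netlist = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
    cells = [(x, y) for x in range(2) for y in range(2)]
    print(greedy_place(netlist, "a", cells))
```

Because `free_cells` can be any set of coordinates, the same sketch also runs on a fragmented region: the caller simply passes whatever cells remain unoccupied.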

A fully pipelined FPGA architecture for stochastic simulation of chemical systems

2013 23rd International Conference on Field programmable Logic and Applications Pub Date : 2013-10-24 DOI: 10.1109/FPL.2013.6645506

David B. Thomas, H. Amano

Abstract: Simulation of chemical systems allows bio-chemists to understand how the interactions of individual molecules can lead to cellular and organism level behaviour. When the concentration of molecules is very small, it is necessary to model every single chemical interaction in a Monte-Carlo simulation, presenting a huge computational burden. This paper presents a new fully pipelined architecture for chemical simulation, which avoids the traditional approach of optimising for minimum operation count, and instead optimises for throughput and parallelism. We show that even though this leads to a higher asymptotic operation count per simulation step, it allows for a much greater degree of spatial and pipeline parallelism, and the increased area is offset by much greater throughput. The new architecture is implemented in a Virtex-6 SX475T and can sustain a rate of over 1 billion reactions per second for problems with fewer than 64 reactions. Compared against existing chemical simulators on small to medium size chemical models, the new architecture is 30-100 times faster than a commercial software simulator running on an 8-core 3.4GHz Core i7, and 12-30 times faster than the best existing FPGA simulators.

Citations: 8
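For readers unfamiliar with per-event Monte-Carlo simulation of chemical systems, the sketch below shows one common software formulation, the Gillespie direct method. The abstract does not name the algorithm the architecture implements, so treating it as Gillespie-style is an assumption, and the names `propensity` and `simulate` and the reaction tuple layout are hypothetical. The propensity formula here only covers reactions whose reactants are distinct species.

```python
import random


def propensity(state, reaction):
    """Rate at which `reaction` fires, given current molecule counts.

    reaction is (rate_constant, reactant_species, state_change).
    Assumes each reactant species appears at most once in the reaction.
    """
    rate, reactants, _ = reaction
    a = rate
    for species in reactants:
        a *= state[species]
    return a


def simulate(state, reactions, t_end, seed=0):
    """Simulate one reaction event at a time until t_end; mutates `state`.

    Returns (final_time, number_of_events).
    """
    rng = random.Random(seed)
    t, events = 0.0, 0
    while True:
        props = [propensity(state, r) for r in reactions]
        total = sum(props)
        if total == 0:
            break  # no reaction can fire any more
        dt = rng.expovariate(total)  # waiting time until the next event
        if t + dt > t_end:
            break
        t += dt
        # Pick which reaction fires, with probability proportional to propensity.
        _, _, change = rng.choices(reactions, weights=props)[0]
        for species, delta in change.items():
            state[species] += delta
        events += 1
    return t, events


if __name__ == "__main__":
    counts = {"A": 100, "B": 0}
    model = [
        (1.0, ["A"], {"A": -1, "B": +1}),  # A -> B
        (0.5, ["B"], {"B": -1, "A": +1}),  # B -> A
    ]
    print(simulate(counts, model, t_end=5.0), counts)
```

Each loop iteration recomputes every propensity, so the work per step grows with the number of reactions. Software simulators usually try to reduce that operation count, whereas the abstract describes accepting a higher count in exchange for parallelism.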

Pipelining computing stages in configurable multicore architectures

2013 23rd International Conference on Field programmable Logic and Applications Pub Date : 2013-10-24 DOI: 10.1109/FPL.2013.6645611

A. Azarian

Abstract: Recently, there has been increasing interest in using task-level pipelining to accelerate the overall execution of applications mainly consisting of Producer-Consumer (P/C) tasks. In this PhD work we propose an approach to achieve pipelined execution of P/C pairs of tasks in FPGA-based multicore architectures. The current approach is able to speed up the overall execution of successive, data-dependent tasks by using multiple cores and specific customization features provided by FPGAs. An important component of our approach is the use of customized inter-stage buffer schemes to communicate data and to synchronize the cores associated with the P/C tasks. To improve the performance, we propose a technique to optimize out-of-order communication between P/C pairs when the consumer requests each produced data element more than once, a behavior present in many applications (e.g., image processing). The current FPGA-based experimental results show the feasibility of our approach in both in-order and out-of-order P/C tasks. Moreover, the results using our approach to task-level pipelining and a multicore architecture reveal noticeable performance improvements for a number of benchmarks over a single core implementation without using task-level pipelining.

Citations: 0
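As a software analogy for task-level pipelining of a Producer-Consumer pair through an inter-stage buffer, the following Python sketch runs the two stages as threads joined by a bounded queue. It only illustrates the in-order case. The helper name `run_pipeline` and the `buffer_size` parameter are hypothetical, and the buffer schemes in the work itself are custom FPGA structures, not a software queue.

```python
import queue
import threading


def run_pipeline(items, produce, consume, buffer_size=4):
    """Run `produce` and `consume` as two overlapping stages.

    The bounded queue plays the role of the inter-stage buffer: it carries
    the data and also synchronizes the stages, because the producer blocks
    when the buffer is full and the consumer blocks when it is empty.
    """
    buf = queue.Queue(maxsize=buffer_size)
    done = object()  # sentinel marking the end of the stream
    results = []

    def producer_stage():
        for item in items:
            buf.put(produce(item))
        buf.put(done)

    def consumer_stage():
        while True:
            value = buf.get()
            if value is done:
                break
            results.append(consume(value))

    stages = [
        threading.Thread(target=producer_stage),
        threading.Thread(target=consumer_stage),
    ]
    for stage in stages:
        stage.start()
    for stage in stages:
        stage.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(range(5), lambda x: x * x, lambda y: y + 1))
```

The consumer can start on the first element while the producer is still working on later ones, which is where the overlap comes from. A consumer that reads elements out of order, or more than once, would need a buffer addressed by index in place of a FIFO.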

A flexible hash table design for 10Gbps key-value stores on FPGAs

2013 23rd International Conference on Field programmable Logic and Applications Pub Date : 2013-10-24 DOI: 10.1109/FPL.2013.6645520

Z. István, G. Alonso, Michaela Blott, K. Vissers

Abstract: Common web infrastructure relies on distributed main memory key-value stores to reduce access load on databases, thereby improving both performance and scalability of web sites. As standard cloud servers provide sub-linear scalability and reduced power efficiency to these kinds of scale-out workloads, we have investigated a novel dataflow architecture for key-value stores with the aid of FPGAs which can deliver consistent 10Gbps throughput. In this paper, we present the design of a novel hash table which forms the centrepiece of this dataflow architecture. The fully pipelined design can sustain consistent 10Gbps line-rate performance by deploying a concurrent mechanism to handle hash collisions. We address problems such as support for a broad range of key sizes without stalling the pipeline through careful matching of lookup time with packet reception time. Finally, the design is based on a scalable architecture that can be easily parametrized to work with different memory types operating at different access speeds and latencies. We deployed this hash table in a memcached prototype to index 2 million entries in 24GBytes of external DDR3 DRAM while sustaining 13 million requests per second for UDP binary encoded memcached packets, which is the maximum packet rate that can be achieved with memcached on a 10Gbps link.

Citations: 54
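The line-rate figures quoted for this paper imply a fixed budget per request, which the short calculation below works out. It uses only the two numbers given (a 10Gbps link and 13 million requests per second). The function names are hypothetical, and how the per-request bytes split between Ethernet framing, UDP/IP headers and memcached payload is not stated in the abstract.

```python
def wire_bytes_per_request(link_bits_per_second, requests_per_second):
    """Average number of bytes on the wire available to each request."""
    return link_bits_per_second / 8 / requests_per_second


def time_budget_ns(requests_per_second):
    """Average time between request arrivals, in nanoseconds."""
    return 1e9 / requests_per_second


if __name__ == "__main__":
    link = 10e9   # 10Gbps link
    rate = 13e6   # 13 million requests per second
    print(f"{wire_bytes_per_request(link, rate):.1f} bytes per request")
    print(f"{time_budget_ns(rate):.1f} ns per request")
```

The result is roughly 96 bytes and 77 ns per request. A pipelined design that accepts a new lookup at least that often keeps up with the link, which is one way to read the abstract's remark about matching lookup time to packet reception time.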

Charge recycling for power reduction in FPGA interconnect

2013 23rd International Conference on Field programmable Logic and Applications Pub Date : 2013-10-24 DOI: 10.1109/FPL.2013.6645509

Safeen Huda, J. Anderson, H. Tamura

Citations: 5

IOPT-tools: A Web based tool framework for embedded systems controller development using Petri nets

2013 23rd International Conference on Field programmable Logic and Applications Pub Date : 2013-10-24 DOI: 10.1109/FPL.2013.6645633

L. Gomes, F. Moutinho, F. Pereira

Citations: 35

Optimizing under abstraction: Using prefetching to improve FPGA performance

2013 23rd International Conference on Field programmable Logic and Applications Pub Date : 2013-10-24 DOI: 10.1109/FPL.2013.6645522

Hsin-Jung Yang, Kermin Fleming, Michael Adler, J. Emer

Citations: 8