2014 24th International Conference on Field Programmable Logic and Applications (FPL): Latest Papers, Page 10

A high speed design and implementation of dynamically reconfigurable processor using 28NM SOI technology

2014 24th International Conference on Field Programmable Logic and Applications (FPL) Pub Date : 2014-10-20 DOI: 10.1109/FPL.2014.6927438

Toru Katagiri, H. Amano

Citations: 5

Exploring architecture parameters for dual-output LUT based FPGAs

2014 24th International Conference on Field Programmable Logic and Applications (FPL) Pub Date : 2014-10-20 DOI: 10.1109/FPL.2014.6927470

Zhenghong Jiang, C. Y. Lin, Liqun Yang, Fei Wang, Haigang Yang

Citations: 4

A survey of open source processors for FPGAs

2014 24th International Conference on Field Programmable Logic and Applications (FPL) Pub Date : 2014-10-20 DOI: 10.1109/FPL.2014.6927482

Rui Jia, C. Y. Lin, Zhenhong Guo, R. Chen, Fei Wang, Tongqiang Gao, Haigang Yang

Citations: 12

Biomedical image processing and reconstruction with dataflow computing on FPGAs

2014 24th International Conference on Field Programmable Logic and Applications (FPL) Pub Date : 2014-10-20 DOI: 10.1109/FPL.2014.6927378

F. Grüll, U. Kebschull

Citations: 8

Heterogeneous Heartbeats: A framework for dynamic management of Autonomous SoCs

2014 24th International Conference on Field Programmable Logic and Applications (FPL) Pub Date : 2014-10-20 DOI: 10.1109/FPL.2014.6927489

Shane T. Fleming, David B. Thomas

Abstract: Modern computer systems are formed from many interacting systems and heterogeneous components that face increasing constraints on performance, power consumption, and temperature. Such systems have complex run-time dynamics that cannot easily be predicted or modelled at design time, creating a need for online dynamic systems management. The Heartbeats API is a popular open source project that provides a standardised way for applications to monitor and publish their progress in multi-core CPU systems, but it does not allow hardware components to be monitored or to observe the progress of other components of the system. This paper presents work that extends the capabilities of the Heartbeats API across the whole system while maintaining backwards compatibility with the legacy software API. To demonstrate the framework's capabilities, an Autonomous Underwater Vehicle (AUV) case study is explored, in which a power-aware HW/SW image processing application is implemented on a reconfigurable SoC and an approximate energy saving of 30% is observed for an example input video. Current progress is also discussed on some applications that build upon the framework, including a CubeSat experiment for an Adaptive Heterogeneous FDIR system that is due to be launched in 2016 by the European Space Agency.
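The progress-monitoring idea described in this abstract can be illustrated with a minimal sketch. It is not the Heartbeats API itself: the class `HeartbeatMonitor`, its methods, the window size, and the component names are all hypothetical, chosen only to show how software and hardware components might publish heartbeats to one place so that any observer can read any component's rate.

```python
from collections import deque


class HeartbeatMonitor:
    """Hypothetical registry of heartbeats; not the real Heartbeats API."""

    def __init__(self, window):
        self.window = window  # number of most recent beats kept per component
        self.beats = {}       # component name -> deque of timestamps (seconds)

    def heartbeat(self, component, timestamp):
        """Record that `component` made one unit of progress at `timestamp`."""
        self.beats.setdefault(component, deque(maxlen=self.window)).append(timestamp)

    def rate(self, component):
        """Return beats per second over the sliding window (0.0 if unknown)."""
        stamps = self.beats.get(component, ())
        if len(stamps) < 2:
            return 0.0
        elapsed = stamps[-1] - stamps[0]
        return (len(stamps) - 1) / elapsed if elapsed > 0 else 0.0


monitor = HeartbeatMonitor(window=4)
for t in [0.0, 0.5, 1.0, 1.5]:          # a software task beating every 0.5 s
    monitor.heartbeat("sw_decoder", t)
for t in [0.0, 0.1, 0.2, 0.3, 0.4]:     # a hardware block beating every 0.1 s
    monitor.heartbeat("hw_filter", t)

sw_rate = monitor.rate("sw_decoder")
hw_rate = monitor.rate("hw_filter")
```

A system manager could compare such rates against per-component targets and, for example, lower a clock or move work between hardware and software when a component runs faster than it needs to; that policy is outside this sketch.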

Citations: 7

Secure partial dynamic reconfiguration with unsecured external memory

2014 24th International Conference on Field Programmable Logic and Applications (FPL) Pub Date : 2014-10-20 DOI: 10.1109/FPL.2014.6927477

H. Kashyap, R. Chaves

Abstract: This paper proposes a solution to improve the security of the partial dynamic reconfiguration of FPGAs without significantly affecting the reconfiguration performance. The existing solutions for secure partial dynamic reconfiguration on SRAM-based FPGAs impact the reconfiguration process and the available resources because of their complex multi-layered partial bitstream validation process. This adversely affects the performance of applications using reconfigurable hardware. The proposed solution uses high performance encryption engines to replace the encryption key of the remotely received bitstream with a randomly generated key, unique to each configuration, when storing the bitstream in the external unsecured memory. An additional CBC-MAC authentication mechanism is also considered that, combined with the frame-wise error detection mechanism of the configuration port, allows for an improved countermeasure against replay attacks and wrongful bitstream usage. The proposed solution introduces a resource overhead of 1.1% relative to the base reconfigurable system and provides the lowest impact on the reconfiguration process when compared to the related state of the art, achieving a reconfiguration throughput of 2.5 Gbps.

Citations: 5

Using buffer-to-BRAM mapping approaches to trade-off throughput vs. memory use

2014 24th International Conference on Field Programmable Logic and Applications (FPL) Pub Date : 2014-10-20 DOI: 10.1109/FPL.2014.6927469

Jasmina Vasiljevic, P. Chow

Abstract: One of the challenges in designing high-performance FPGA applications is fine-tuning the use of limited on-chip memory storage among many buffers in an application. To achieve the desired performance and meet the on-chip memory budget requirements, the designer faces the burden of manually assigning application buffers to physical on-chip memories. Mismatches between the dimensions (bit-width and depth) of buffers and physical on-chip memories lead to underutilized memories. Memory utilization can be increased via buffer packing (grouping buffers together and implementing them as a single memory) at the expense of data throughput. However, identifying buffer groups that result in the least amount of physical memory is a combinatorial problem with a large search space. This process is time consuming and non-trivial, particularly with a large number of buffers of various depths and bit widths. Previous work [1] introduced a tool that provides high-level pragmas allowing the user to specify global memory requirements, such as an application's on-chip memory budget and data throughput. This paper extends the previous work by introducing two low-level pragmas that specify information about memory access patterns, resulting in an improvement in on-chip memory utilization of up to 22%. Further, we develop a simulated annealing based buffer packing algorithm, which reduces the tool's run-time from over 30 mins down to 15 sec, with an improvement in performance in the generated memory solution. Finally, we demonstrate the effectiveness of our tool with four stream application benchmarks.
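The dimension mismatch and the effect of buffer packing mentioned in this abstract can be made concrete with a small calculation. The sketch below is illustrative only and is not the authors' tool: the aspect ratios in `BRAM_CONFIGS` are assumed values for one 18 Kbit block RAM, the buffer sizes are invented, and `pack_by_depth` shows just one simple packing style (stacking buffers in a shared address space) with no simulated annealing search.

```python
import math

# Assumed (depth, width) aspect ratios of a single 18 Kbit block RAM.
BRAM_CONFIGS = [(16384, 1), (8192, 2), (4096, 4), (2048, 9), (1024, 18), (512, 36)]


def brams_needed(width, depth):
    """Fewest BRAMs for a width x depth buffer, using one aspect ratio throughout."""
    return min(
        math.ceil(width / w) * math.ceil(depth / d) for d, w in BRAM_CONFIGS
    )


def pack_by_depth(buffers):
    """BRAMs needed when buffers are stacked one after another in one memory."""
    width = max(w for w, _ in buffers)   # the shared memory must fit the widest buffer
    depth = sum(d for _, d in buffers)   # each buffer gets its own address range
    return brams_needed(width, depth)


# Hypothetical application buffers as (bit-width, depth).
buffers = [(36, 100), (36, 200), (32, 150)]

separate = sum(brams_needed(w, d) for w, d in buffers)  # one memory per buffer
packed = pack_by_depth(buffers)                         # all buffers in one memory
```

With these invented sizes each buffer alone occupies a whole BRAM while using only a fraction of it, and the three packed together still fit in one. The cost is that the packed buffers now share the memory's ports, which is the throughput trade-off the abstract refers to.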

Citations: 6

Caching memcached at reconfigurable network interface

2014 24th International Conference on Field Programmable Logic and Applications (FPL) Pub Date : 2014-10-20 DOI: 10.1109/FPL.2014.6927487

E. Fukuda, Hiroaki Inoue, Takashi Takenaka, Dahoo Kim, Tsunaki Sadahisa, T. Asai, M. Motomura

Abstract: Memcached is a technology that improves the response speed of web servers by caching data in DRAM on distributed servers. In order to achieve higher performance, memcached has been evaluated on various platforms. Among them, the FPGA seems to be the most efficient platform on which to run memcached, and several research groups are trying to achieve higher throughput with it. However, it is difficult to utilize a large amount of memory (several dozen gigabytes) with an FPGA. Some groups are trying to solve this problem by using an embedded CPU for memory allocation, and another group is employing an SSD. Unlike other approaches that try to replace memcached itself on FPGAs, our approach augments the software memcached running on the host CPU by caching its data and some operations at the FPGA-equipped network interface card (NIC) mounted on the server. The locality of memcached data enables the FPGA NIC to have a fairly high hit rate with a smaller memory. We first explore the cache parameters by software simulations and estimate the effectiveness of our approach, and then prototype a system to prove its effectiveness. Through our evaluation with YCSB, a standard key-value store (KVS) benchmarking tool, we estimate that the latency improved by an order of magnitude over software memcached running on a high performance CPU.
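The kind of software simulation mentioned in this abstract, in which a small cache at the NIC sits in front of the full key-value store on the host, might look like the following sketch. Everything here is an assumption made for illustration: the class `NicCache`, the LRU eviction policy, the write-through behaviour of `set`, the capacity, and the access trace are not taken from the paper.

```python
from collections import OrderedDict


class NicCache:
    """Hypothetical small LRU cache in front of a host-side key-value store."""

    def __init__(self, capacity, host_store):
        self.capacity = capacity
        self.host_store = host_store   # stands in for software memcached on the host
        self.entries = OrderedDict()   # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.entries:        # served at the NIC, host CPU not involved
            self.entries.move_to_end(key)
            self.hits += 1
            return self.entries[key]
        self.misses += 1               # forwarded to the host
        value = self.host_store.get(key)
        if value is not None:
            self._insert(key, value)
        return value

    def set(self, key, value):
        self.host_store[key] = value   # write-through (an assumption)
        self._insert(key, value)

    def _insert(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry


host = {f"key{i}": f"value{i}" for i in range(100)}
cache = NicCache(capacity=4, host_store=host)

# A short trace with locality: a few keys are requested repeatedly.
trace = ["key1", "key2", "key1", "key1", "key3", "key2", "key99", "key1"]
results = [cache.get(k) for k in trace]
hit_rate = cache.hits / len(trace)
```

Running such a simulation over a realistic trace while varying `capacity` is one way to estimate how small the NIC-side memory can be before the hit rate drops.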

Citations: 31

Transparent insertion of latency-oblivious logic onto FPGAs

2014 24th International Conference on Field Programmable Logic and Applications (FPL) Pub Date : 2014-10-20 DOI: 10.1109/FPL.2014.6927497

Eddie Hung, T. Todman, W. Luk

Abstract: We present an approach for transparently inserting latency-oblivious functionality into pre-existing FPGA circuits. To ensure transparency (that is, that such modifications do not affect the design's maximum clock frequency), we insert any additional logic post place-and-route, using only the spare resources that were not consumed by the pre-existing circuit. The typical challenge with adding new functionality into existing circuits incrementally is that spare FPGA resources to host this functionality must be located close to the input signals that it requires, in order to minimise the impact of routing delays. In congested designs, however, such co-location is often not possible. We overcome this challenge by using flow techniques to pipeline and route signals from where they originate, potentially in a region of high resource congestion, into a region of low congestion capable of hosting new circuitry, at the expense of latency. We demonstrate and evaluate our approach by augmenting realistic designs with self-monitoring circuitry, which is not sensitive to latency. We report results on circuits operating at over 200 MHz and show that our insertions have no impact on timing, are 2-4 times faster than compile-time insertion, and incur only a small power overhead.

Citations: 19

RAM-based hardware accelerator for network data anonymization

2014 24th International Conference on Field Programmable Logic and Applications (FPL) Pub Date : 2014-10-20 DOI: 10.1109/FPL.2014.6927400

Fumito Yamaguchi, Kanae Matsui, H. Nishi

Citations: 4