FPGA. ACM International Symposium on Field-Programmable Gate Arrays: Latest Papers (Page 4)

Building zynq® accelerators with Vivado® high level synthesis

FPGA. ACM International Symposium on Field-Programmable Gate Arrays Pub Date : 2013-02-11 DOI: 10.1145/2435264.2435266

S. Neuendorffer, F. Martinez-Vallina

Citations: 27

Precision fault injection method based on correspondence between configuration bitstream and architecture (abstract only)

FPGA. ACM International Symposium on Field-Programmable Gate Arrays Pub Date : 2013-02-11 DOI: 10.1145/2435264.2435317

Jing Zhou, Lei Chen, Shuo Wang

Abstract: SRAM-based FPGAs are increasingly widely used, but they are susceptible to single-event upsets (SEUs). A variety of fault injection techniques have been studied to emulate the effects of SEUs; however, existing fault injection processes help little in studying the SEU mechanism. To support further study, the Beijing Microelectronics Technology Institute (BMTI), which is engaged in the design, test, packaging, and failure analysis of large-scale integration (LSI) and very-large-scale integration (VLSI) circuits, developed the Automated Precision Fault Injection System (APFIS). APFIS, however, is not precise enough, so this paper studies a more accurate precision fault injection method and builds the Automated Precision Fault Injection System-II (APFIS-II) on it. Early Xilinx devices are still used in special applications, yet they lack such tools, which let users optimize their designs conveniently. APFIS-II is therefore implemented with a Virtex device to improve the reliability of systems that contain early devices. The FPGA architecture and configuration bitstream are analyzed in detail, and the correspondence between on-chip FPGA resources and the configuration bitstream is derived. Based on this correspondence, the bitstream is divided into several segments. APFIS-II injects faults precisely into a chosen segment rather than into the entire bitstream, so faults can be targeted at a specific on-chip resource. This makes fault injection more effective and more targeted, which benefits the study of the SEU mechanism and of mitigation techniques.

Pages: 267
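The segment-level injection described in the abstract above can be illustrated with a minimal Python sketch. It is not the APFIS-II implementation: the `Segment` type, the bit ranges, and the all-zero placeholder bitstream are hypothetical, and a real Virtex bitstream would need the resource-to-bitstream correspondence the authors derive.

```python
import random
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Segment:
    """Hypothetical bit range of the bitstream tied to one on-chip resource."""
    name: str
    start_bit: int  # inclusive
    end_bit: int    # exclusive


def flip_bit(bitstream: bytes, bit_index: int) -> bytes:
    """Return a copy of the bitstream with one bit inverted (one emulated SEU)."""
    corrupted = bytearray(bitstream)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def inject_fault(bitstream: bytes, segment: Segment,
                 rng: random.Random) -> Tuple[bytes, int]:
    """Flip one randomly chosen bit inside the given segment only."""
    if not 0 <= segment.start_bit < segment.end_bit <= len(bitstream) * 8:
        raise ValueError("segment lies outside the bitstream")
    bit_index = rng.randrange(segment.start_bit, segment.end_bit)
    return flip_bit(bitstream, bit_index), bit_index


if __name__ == "__main__":
    golden = bytes(16)  # placeholder 128-bit "bitstream", not a real one
    lut_segment = Segment("LUT contents (hypothetical)", start_bit=32, end_bit=64)
    faulty, where = inject_fault(golden, lut_segment, random.Random(0))
    print(f"flipped bit {where} in segment {lut_segment.name!r}")
```

Restricting the random choice to one segment is what makes the injection targeted: every emulated upset lands in the resource that segment configures, instead of anywhere in the bitstream.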

Citations: 2

A memory-efficient hardware architecture for real-time feature detection of the SIFT algorithm (abstract only)

FPGA. ACM International Symposium on Field-Programmable Gate Arrays Pub Date : 2013-02-11 DOI: 10.1145/2435264.2435332

Wenjuan Deng, Yiqun Zhu

Citations: 1

Achieving modular dynamic partial reconfiguration with a difference-based flow (abstract only)

FPGA. ACM International Symposium on Field-Programmable Gate Arrays Pub Date : 2013-02-11 DOI: 10.1145/2435264.2435324

Sezer Gören, Yusuf Turk, Ozgur Ozkurt, Abdullah Yildiz, H. F. Ugurdag

Abstract: Dynamic Partial Reconfiguration (DPR) of Xilinx FPGAs in cases where there is a significant logic difference between subsequent configurations is made possible by the Xilinx module-based PR flow. Xilinx supports this flow only for high-end FPGAs and requires a paid license, without which the Xilinx PlanAhead software disables the related knobs and features. This poster presents a unique methodology, called DPR-LD (DPR for Large Differences), that enables DPR of both low-end and high-end Xilinx FPGAs and requires no paid license. DPR-LD uses the free Xilinx difference-based bit file generation software (bitgen), which is normally meant only for small differences between subsequent configurations. DPR-LD can be realized through either FPGA Editor or PlanAhead. Our FPGA Editor flow requires several physical constraints to ensure contention-free implementation of static and dynamic modules. We use implementation, floorplanning, and placement constraints to partition the design into several physical regions (one per module) for mapping, packing, placement, and routing. To prevent the routing of one module from crossing over another module, "fortress blocks" are used to isolate the modules from each other. However, fortress blocks waste FPGA resources. In our PlanAhead flow, by contrast, the physical constraints are entered via a GUI, and the corresponding actual physical constraints are generated automatically without wasting FPGA resources. To evaluate the two approaches, a proof-of-concept application with a single dynamic region was implemented using both flows. In addition, a design with multiple dynamic regions was implemented with our PlanAhead flow.

Pages: 270

Citations: 8

Harnessing the power of FPGAs using altera's OpenCL compiler

FPGA. ACM International Symposium on Field-Programmable Gate Arrays Pub Date : 2013-02-11 DOI: 10.1145/2435264.2435268

Deshanand P. Singh, Tomasz S. Czajkowski, A. Ling

Abstract: In recent years, Field-Programmable Gate Arrays have become extremely powerful computational platforms that can efficiently solve many complex problems. The most modern FPGAs comprise effectively millions of programmable elements, signal processing elements and high-speed interfaces, all of which are necessary to deliver a complete solution. The power of FPGAs is unlocked via low-level programming languages such as VHDL and Verilog, which allow designers to explicitly specify the behavior of each programmable element. While these languages provide a means to create highly efficient logic circuits, they are akin to "assembly language" programming for modern processors. This is a serious limiting factor for both productivity and the adoption of FPGAs on a wider scale. In this talk, we use the OpenCL language to explore techniques that allow us to program FPGAs at a level of abstraction closer to traditional software-centric approaches. OpenCL is an industry standard parallel language based on 'C' that offers numerous advantages that enable designers to take full advantage of the capabilities offered by FPGAs, while providing a high-level design entry language that is familiar to a wide range of programmers.

To demonstrate the advantages a high-level programming language can offer, we show how to use Altera's OpenCL Compiler on a set of case studies. The first application is single-precision general-element matrix multiplication (SGEMM). It is an example of a highly parallel algorithm for which efficient circuit structures are well known. We show how this application can be implemented in OpenCL and how the high-level description can be optimized to generate the most efficient circuit in hardware. The second application is a Fast Fourier Transform (FFT), a classical FPGA benchmark that is known to have a good implementation on FPGAs. We show how we can implement the FFT algorithm, while exploring the many different possible architectural choices that lead to an optimized implementation for a given FPGA. Finally, we discuss a Monte-Carlo Black-Scholes simulation, which demonstrates the computational power of FPGAs. We describe how a random number generator in conjunction with computationally intensive operations can be harnessed on an FPGA to generate a high-speed benchmark, which also consumes far less power than the same benchmark running on a comparable GPU. We conclude the tutorial with a set of live demonstrations.

Through this tutorial we show the benefits high-level languages offer for system-level design and productivity. In particular, Altera's OpenCL compiler is shown to enable high-performance application design that fully utilizes capabilities of modern FPGAs.

Pages: 5-6
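To make the SGEMM case study in the abstract above more concrete, here is a minimal Python sketch of the data-parallel decomposition that an OpenCL-style kernel typically uses: one work-item per output element. It is an emulation for illustration only, not the tutorial's kernel and not OpenCL code; the function names `sgemm_work_item` and `sgemm` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def sgemm_work_item(a: Matrix, b: Matrix, c: Matrix, row: int, col: int,
                    alpha: float = 1.0, beta: float = 0.0) -> None:
    """Body of one hypothetical work-item: updates only C[row][col]."""
    acc = 0.0
    for k in range(len(b)):
        acc += a[row][k] * b[k][col]
    c[row][col] = alpha * acc + beta * c[row][col]


def sgemm(a: Matrix, b: Matrix, c: Matrix,
          alpha: float = 1.0, beta: float = 0.0) -> None:
    """Emulate a 2-D index space by visiting every (row, col) pair in turn."""
    if any(len(r) != len(b) for r in a):
        raise ValueError("inner dimensions of A and B do not match")
    for row in range(len(a)):
        for col in range(len(b[0])):
            sgemm_work_item(a, b, c, row, col, alpha, beta)


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    c = [[0.0, 0.0], [0.0, 0.0]]
    sgemm(a, b, c)
    print(c)  # [[19.0, 22.0], [43.0, 50.0]]
```

Because each work-item writes a distinct element of C and only reads A and B, the iterations are independent, which is the property a compiler can exploit when turning such a kernel into parallel or pipelined hardware.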

Citations: 22

A remote memory access infrastructure for global address space programming models in FPGAs

FPGA. ACM International Symposium on Field-Programmable Gate Arrays Pub Date : 2013-02-11 DOI: 10.1145/2435264.2435301

Ruediger Willenberg, P. Chow

Citations: 11

Low power FPGA design using post-silicon device aging (abstract only)

FPGA. ACM International Symposium on Field-Programmable Gate Arrays Pub Date : 2013-02-11 DOI: 10.1145/2435264.2435340

Sheng Wei, J. Zheng, M. Potkonjak

Citations: 7

Genome sequencing using mapreduce on FPGA with multiple hardware accelerators (abstract only)

FPGA. ACM International Symposium on Field-Programmable Gate Arrays Pub Date : 2013-02-11 DOI: 10.1145/2435264.2435313

Chao Wang, Xi Li, Xuehai Zhou, Jim Martin, R. Cheung

Abstract: Genome sequencing with short reads is an emerging field with seemingly limitless possibilities for advances in numerous scientific research and application domains, and it has been a hot topic in recent years. As data volumes grow and access becomes easier for individual users, shortening the response time for short-read mapping in large-scale computing is extremely important. In this paper we propose a novel FPGA-based acceleration solution that uses a Map-Reduce framework on multiple hardware acceleration engines. The combination of hardware accelerators and the Map-Reduce execution flow could greatly expedite the task of aligning short reads to a known reference genome. Our approach is based on preprocessing the reference genomes and running iterative jobs to align the continuously incoming reads. The read-mapping algorithm is modeled after the approach of the reputable RMAP software. Furthermore, a theoretical speedup analysis on a MapReduce programming platform is presented, which demonstrates that the proposed architecture has the potential to reduce the average waiting time for large-scale short-read applications.

Pages: 266
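The flow outlined in the abstract above (preprocess the reference, map incoming reads, combine the results) can be sketched in plain Python. This is a simplified seed-and-verify illustration under stated assumptions, not the RMAP algorithm and not the authors' hardware design; `SEED_LEN`, `build_index`, `map_read`, and `reduce_hits` are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

SEED_LEN = 4  # hypothetical seed length, chosen small for the toy example

Hit = Tuple[int, int]  # (mismatches, position in the reference)


def build_index(reference: str, k: int = SEED_LEN) -> Dict[str, List[int]]:
    """Preprocessing: record every position at which each k-mer occurs."""
    index: Dict[str, List[int]] = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def map_read(read_id: str, read: str, reference: str,
             index: Dict[str, List[int]],
             max_mismatches: int = 1) -> Iterator[Tuple[str, Hit]]:
    """Map phase: emit (read_id, hit) for each candidate within the budget."""
    for pos in index.get(read[:SEED_LEN], []):
        window = reference[pos:pos + len(read)]
        if len(window) < len(read):
            continue  # read would run off the end of the reference
        mismatches = sum(x != y for x, y in zip(read, window))
        if mismatches <= max_mismatches:
            yield read_id, (mismatches, pos)


def reduce_hits(pairs: Iterable[Tuple[str, Hit]]) -> Dict[str, Hit]:
    """Reduce phase: keep the best hit per read (fewest mismatches, leftmost)."""
    best: Dict[str, Hit] = {}
    for read_id, hit in pairs:
        if read_id not in best or hit < best[read_id]:
            best[read_id] = hit
    return best


if __name__ == "__main__":
    reference = "ACGTACGTTAGC"
    reads = {"r1": "ACGTT", "r2": "TAGC", "r3": "GGGG"}
    index = build_index(reference)
    emitted = [pair for rid, seq in reads.items()
               for pair in map_read(rid, seq, reference, index)]
    print(reduce_hits(emitted))  # {'r1': (0, 4), 'r2': (0, 8)}
```

Each call to `map_read` depends only on its own read and the shared index, so the map phase is the part that could be spread across several acceleration engines, with the reduce phase merging their outputs.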

Citations: 6

Fully-functional FPGA prototype with fine-grain programmable body biasing

FPGA. ACM International Symposium on Field-Programmable Gate Arrays Pub Date : 2013-02-11 DOI: 10.1145/2435264.2435280

M. Hioki, T. Sekigawa, T. Nakagawa, H. Koike, Y. Matsumoto, Takashi Kawanami, T. Tsutsumi

Abstract: A fully functional FPGA prototype chip, in which a programmable body bias voltage can be applied individually to elemental circuits such as MUXes, LUTs, and DFFs, is fabricated in low-power 90-nm bulk CMOS technology, and its area overhead, dynamic current, static current, and operating speed are evaluated in silicon. For the measurements, 10 ISCAS benchmark circuits are implemented using newly developed CAD tools consisting of a VT mapper, a placer, and a router. The mask layout shows that well-separated margins, programmable body bias circuits, and additional configuration memories occupy 54% of the FPGA tile area. Measurement results show that the fabricated FPGA reduces the static current by 91.4% on average. In addition, evaluations using a ring oscillator with various body bias voltage pairs show that the static current falls from 23.1 µA to 1.0 µA when a low threshold voltage is assigned to the MOSFETs on the critical path and a high threshold voltage to the remaining MOSFETs, while the oscillation frequency stays at 6.6 MHz, the same as when all MOSFETs are assigned the low threshold voltage. Moreover, the fine-grain programmable body bias technique raises the oscillation frequency of a ring oscillator implemented on the FPGA by aggressively applying forward body bias, while assigning a high threshold voltage (HVT) to MOSFETs on the non-critical path through reverse body biasing effectively suppresses the exponential increase in static current caused by the forward body biasing.

Pages: 73-80
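As a quick check on the ring-oscillator figures quoted in the abstract above, the relative reduction in static current follows directly from the two measured values. The symbols below are labels introduced here for illustration, not the authors' notation: the first is the current with all MOSFETs at low threshold voltage, the second is the current with the mixed assignment.

```latex
\[
\frac{I_{\text{LVT}} - I_{\text{mixed}}}{I_{\text{LVT}}}
  = \frac{23.1\,\mu\text{A} - 1.0\,\mu\text{A}}{23.1\,\mu\text{A}}
  \approx 0.957 = 95.7\%
\]
```

This ring-oscillator figure is separate from the 91.4% average reduction reported for the ISCAS benchmark circuits.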

Citations: 14

Custom instruction generation and mapping for reconfigurable instruction set processors (abstract only)

FPGA. ACM International Symposium on Field-Programmable Gate Arrays Pub Date : 2013-02-11 DOI: 10.1145/2435264.2435318

Chao Wang, Xi Li, Huizhen Zhang, J. Ji, Xuehai Zhou

Abstract: Reconfigurable instruction set processors (RISPs) are an emerging research field for state-of-the-art adaptive systems. However, generating custom instructions and mapping them to the original code still poses significant challenges. This paper proposes a generation and mapping scheme for extending adaptive RISPs with custom instructions. First, target function blocks (basic blocks) are generated by a dynamic profiler. Each selected hot spot is then treated as a custom instruction and implemented in reconfigurable hardware logic units. For instruction selection, an instruction generator provides a mapping mechanism from hot blocks to hardware implementations, using data flow analysis, instruction clustering, subgraph enumeration, and subgraph merging techniques. Finally, the original executable files are recompiled and regenerated by a customized GCC compiler. To demonstrate the effectiveness and performance of the framework, a prototype instruction generator has been implemented to verify the correctness and efficiency of the mapping mechanism.

Pages: 268

Citations: 0