2014 24th International Conference on Field Programmable Logic and Applications (FPL): Latest Papers, Page 6

Pipelined compressor tree optimization using integer linear programming

2014 24th International Conference on Field Programmable Logic and Applications (FPL) Pub Date : 2014-10-20 DOI: 10.1109/FPL.2014.6927468

M. Kumm, P. Zipf

Citations: 37

High level programming framework for FPGAs in the data center

2014 24th International Conference on Field Programmable Logic and Applications (FPL) Pub Date : 2014-10-20 DOI: 10.1109/FPL.2014.6927442

Oren Segal, M. Margala, S. R. Chalamalasetti, M. Wright

Abstract: Heterogeneous computing offers a promising solution for energy-efficient computing in the data center. FPGA-based heterogeneous computing is an especially promising direction since it allows for the creation of custom hardware solutions for data-centric parallel applications. One of the main issues delaying widespread adoption of FPGAs as mainstream high-performance computing devices is the difficulty in programming them. OpenCL was meant to address the difficulties and the non-uniformity related to programming heterogeneous devices. Unfortunately, because of its complexity, it sets the bar high for many software programmers, preventing them from directly benefiting from the computing power and energy efficiency that OpenCL and heterogeneous computing have to offer. This work presents an effort to bridge the gap by extending an existing Java programming framework (APARAPI), based on OpenCL, so that it can be used to program FPGAs at a high level of abstraction and with increased ease of programmability. We run several real-world algorithms to assess the performance of the APARAPI framework on both a low-end and a high-end system. On the low-end and high-end systems respectively, we find up to 78-80 percent power reduction and 4.8X-5.3X speed increase running NBody simulation, as well as up to 65-80 percent power reduction and 6.2X-7X speed increase for a K-Means MapReduce algorithm running on top of the Hadoop framework and APARAPI.

Citations: 19

Configuration approaches to improve computing efficiency of coarse-grained reconfigurable multimedia processor

2014 24th International Conference on Field Programmable Logic and Applications (FPL) Pub Date : 2014-10-20 DOI: 10.1109/FPL.2014.6927439

Chen Yang, Leibo Liu, Yansheng Wang, S. Yin, Peng Cao, Shaojun Wei

Citations: 0

A fast and scalable FPGA damage diagnostic service for R3TOS using BIST cloning technique

2014 24th International Conference on Field Programmable Logic and Applications (FPL) Pub Date : 2014-10-20 DOI: 10.1109/FPL.2014.6927386

Ali Ebrahim, T. Arslan, X. Iturbe

Abstract: This paper presents a new technique to be used in the context of reconfigurable computing to accelerate the online diagnosis of permanent damage on Xilinx FPGAs using Built-In Self Tests (BISTs). Detecting and locating permanently damaged resources with precision is central to keeping the system implemented on the FPGA flawless at all times; i.e., upcoming hardware tasks are mapped to available functional resources, circumventing the use of the damaged ones. The proposed diagnostic technique exploits the Multiple Frame Write (MFW) feature available in Xilinx FPGAs to "clone" (i.e., replicate) a single basic BIST circuit along arbitrarily sized and shaped areas on the FPGA without incurring large time overheads. Hence, the proposed technique allows for creating on-demand, tailored BIST circuits at runtime to satisfy any diagnosis requirements that may arise. Moreover, the proposed solution saves memory in the system, as it only requires storing basic BIST circuits. Finally, the paper presents a diagnostic service for a Reliable Reconfigurable Real-Time Operating System (R3TOS) that is based on the BIST cloning technique and works in cooperation with the R3TOS fault-handling and recovery mechanisms.

Citations: 5

Achieving portability and efficiency over chip heterogeneous multiprocessor systems

2014 24th International Conference on Field Programmable Logic and Applications (FPL) Pub Date : 2014-10-20 DOI: 10.1109/FPL.2014.6927395

E. Cartwright, A. Sadeghian, Sen Ma, D. Andrews

Citations: 4

An efficient and flexible host-FPGA PCIe communication library

2014 24th International Conference on Field Programmable Logic and Applications (FPL) Pub Date : 2014-10-20 DOI: 10.1109/FPL.2014.6927459

Jian Gong, Tao Wang, Jiahua Chen, Haoyang Wu, Fan Ye, Songwu Lu, J. Cong

Abstract: A high-performance interconnection between a host processor and FPGA accelerators is in high demand. Among various interconnection methods, a PCIe bus is an attractive choice for loosely coupled accelerators. Because there is no standard host-FPGA communication library, FPGA developers have to write significant amounts of PCIe-related code on both the FPGA side and the host processor side. A high-performance host-FPGA PCIe communication library holds the key to broadening the use of FPGA accelerators. In this paper we target efficiency and flexibility as two important features in such a library. We discuss the challenges in providing these features, and present our solution to these challenges. We propose EPEE, an efficient and flexible host-FPGA PCIe communication library, and describe its design. We implemented EPEE in various generations of Xilinx FPGAs with up to 26.24 Gbps half-duplex and 43.02 Gbps full-duplex aggregate throughput in the PCIe Gen2 x8 mode; these are at the best utilization levels that a host-FPGA PCIe library can achieve. The EPEE library has been integrated into four different FPGA applications with different data usage patterns at various institutes.
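To put the throughput figures above in context, the following sketch compares them with the raw link bandwidth of PCIe Gen2 x8 (5 GT/s per lane with 8b/10b encoding). It assumes the half-duplex figure is measured in one direction and the full-duplex figure is the sum of both directions; the function names are illustrative only. Packet and protocol overhead is ignored, so the practically achievable ceiling is lower than the raw figure used here.

```python
# PCIe Gen2 signals at 5 GT/s per lane and uses 8b/10b encoding,
# so each lane carries 4 Gbps of data per direction before protocol overhead.
GEN2_GT_PER_S = 5.0
ENCODING_EFFICIENCY = 8 / 10


def raw_link_gbps(lanes: int) -> float:
    """Raw data bandwidth of a Gen2 link in one direction, in Gbps."""
    return GEN2_GT_PER_S * ENCODING_EFFICIENCY * lanes


def utilization(measured_gbps: float, lanes: int, directions: int = 1) -> float:
    """Fraction of the raw bandwidth that a measured throughput represents."""
    return measured_gbps / (raw_link_gbps(lanes) * directions)


half = utilization(26.24, lanes=8)                # one direction
full = utilization(43.02, lanes=8, directions=2)  # both directions combined
print(f"{half:.1%}")  # 82.0%
print(f"{full:.1%}")  # 67.2%
```

Under these assumptions the reported half-duplex figure is about 82% of the raw 32 Gbps per-direction bandwidth.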

Citations: 34

Ultrasmall: The smallest MIPS soft processor

2014 24th International Conference on Field Programmable Logic and Applications (FPL) Pub Date : 2014-10-20 DOI: 10.1109/FPL.2014.6927387

Hiroshi Nakatsuka, Yuichiro Tanaka, Thiem Van Chu, Shinya Takamaeda-Yamazaki, Kenji Kise

Abstract: Soft processors have been commonly used in FPGA-based designs to perform various useful functions. Some of these functions are not performance-critical and need to be implemented using very few FPGA resources. For such cases, it is desirable to reduce the circuit area of the soft processor as much as possible. This paper proposes Ultrasmall, a small soft processor for FPGAs. Ultrasmall supports a subset of the MIPS-I ISA and is designed for microcontrollers in FPGA-based SoCs. Ultrasmall employs an area-efficient architecture to minimize the use of FPGA resources. While supporting the 32-bit ISA, Ultrasmall adopts a 2-bit-wide serial ALU architecture. This approach significantly reduces FPGA resource usage. In addition to device-independent optimizations applicable to any FPGA, we apply primitives-based optimizations for the Xilinx Spartan-3E FPGA series with 4-input LUTs, thereby further reducing the total number of occupied slices. The evaluation result shows that, on the Xilinx Spartan-3E XC3S500E FPGA, Ultrasmall occupies only 137 slices, which is 84% of the number of slices occupied by Supersmall, a very small soft processor with the same design concept as Ultrasmall. On the other hand, in terms of performance, Ultrasmall is 2.9× faster than Supersmall.
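The idea of a narrow serial ALU serving a 32-bit ISA can be illustrated with a small software model. The sketch below is not Ultrasmall's actual datapath; it is a hypothetical model (the name `serial_add32` and its parameters are assumptions) showing how a 2-bit adder reused over successive cycles, with the carry passed from one slice to the next, produces a full 32-bit sum at the cost of extra cycles.

```python
MASK32 = 0xFFFFFFFF


def serial_add32(a: int, b: int, slice_bits: int = 2):
    """Add two 32-bit values slice_bits at a time; return (sum, cycles)."""
    slice_mask = (1 << slice_bits) - 1
    result, carry, cycles = 0, 0, 0
    for shift in range(0, 32, slice_bits):
        # Take the next slice of each operand, least significant first.
        a_slice = (a >> shift) & slice_mask
        b_slice = (b >> shift) & slice_mask
        total = a_slice + b_slice + carry
        # Keep the low bits as the result slice and carry the overflow forward.
        result |= (total & slice_mask) << shift
        carry = total >> slice_bits
        cycles += 1
    return result & MASK32, cycles


value, cycles = serial_add32(0xFFFFFFFF, 1)
print(hex(value), cycles)  # 0x0 16
```

With 2-bit slices a 32-bit addition takes 16 iterations in this model, which illustrates the area-for-time trade-off behind a serial ALU.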

Citations: 10

Ready PCIe data streaming solutions for FPGAs

2014 24th International Conference on Field Programmable Logic and Applications (FPL) Pub Date : 2014-10-20 DOI: 10.1109/FPL.2014.6927444

Thomas B. Preußer, R. Spallek

Citations: 10

Fast and accurate SEU-tolerance characterization method for Zynq SoCs

2014 24th International Conference on Field Programmable Logic and Applications (FPL) Pub Date : 2014-10-20 DOI: 10.1109/FPL.2014.6927416

Igor Villata, U. Bidarte, Uli Kretzschmar, A. Astarloa, Jesús Lázaro

Abstract: In this paper, a new SEU (Single Event Upset) emulation method for testing fault-tolerant systems in FPGAs is presented. It is implemented on a "Xilinx Zynq®-7000 All Programmable System on Chip (SoC)" device, which combines a hard microprocessor with programmable logic. An important new feature is that an internal hardware configuration interface controlled by this microprocessor is provided. This interface is used for injecting faults into the configuration bitstream in order to emulate radiation effects. Since both the processing system and the programmable logic are in the same chip, this method has the high-speed characteristics of internal fault injection methods. As a hard internal configuration interface is provided, a configuration bit belonging to the internal interface port cannot be flipped, and injection side effects are avoided. This method is especially suitable for testing complex real fault-tolerant FPGA designs because no substantial modifications need to be added to the original design. A universal verification system is proposed to avoid designing complex external application-dependent testbenches.
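The core operation of bitstream-level SEU emulation, inverting a single configuration bit, can be sketched in software. The example below is a simplified, hypothetical model that treats configuration data as a flat byte array; the helper names are assumptions, and it does not reproduce the Zynq configuration interface or the frame-based addressing a real injector would use.

```python
import random


def flip_bit(bitstream: bytearray, bit_index: int) -> None:
    """Invert one bit in place, emulating a single event upset."""
    bitstream[bit_index // 8] ^= 1 << (bit_index % 8)


def inject_random_seu(bitstream: bytearray, rng: random.Random) -> int:
    """Flip one randomly chosen bit and return its index for logging."""
    bit_index = rng.randrange(len(bitstream) * 8)
    flip_bit(bitstream, bit_index)
    return bit_index


golden = bytes(16)          # stand-in for fault-free configuration data
faulty = bytearray(golden)  # working copy that receives the fault
hit = inject_random_seu(faulty, random.Random(0))

# Count differing bits between the golden and faulty copies.
diff = sum(bin(g ^ f).count("1") for g, f in zip(golden, faulty))
print(diff)  # 1
```

Flipping the same bit a second time restores the original data, which is how a campaign in this model could undo one fault before injecting the next.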

Citations: 12

Using an OpenCL framework to evaluate interconnect implementations on FPGAs

2014 24th International Conference on Field Programmable Logic and Applications (FPL) Pub Date : 2014-10-20 DOI: 10.1109/FPL.2014.6927440

Vincent Mirian, P. Chow

Citations: 4