2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM): Latest Papers, Page 2

A LUT-Based Approximate Adder

2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) Pub Date : 2016-05-01 DOI: 10.1109/FCCM.2016.16

Andreas Becher, Jorge Echavarria, Daniel Ziener, S. Wildermann, J. Teich

Citations: 13

Evaluating Embedded FPGA Accelerators for Deep Learning Applications

2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) Pub Date : 2016-05-01 DOI: 10.1109/FCCM.2016.14

Gopalakrishna Hegde, Siddhartha, Nachiappan Ramasamy, Vamsi Buddha, Nachiket Kapre

Citations: 5

Spatial Predicates Evaluation in the Geohash Domain Using Reconfigurable Hardware

2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) Pub Date : 2016-05-01 DOI: 10.1109/FCCM.2016.51

Dajung Lee, R. Moussalli, S. Asaad, M. Srivatsa

Abstract: As location sensing devices are becoming ubiquitous, overwhelming amounts of data are being produced by the Internet-of-Things-That-Move. Though analyzing this data presents significant business opportunities, new techniques are needed to attain adequate levels of processing performance. One example is the recently introduced geohash geographical coordinate system that is mainly used for indexing. While geohash codes provide useful inherent properties such as hierarchical and variable-precision coding, traditional spatial algorithms operate on data represented using the conventional latitude/longitude geographical coordinate system, and as such do not take advantage of geohash coding. This paper tackles the evaluation of spatial predicates on geometries defined in the geohash domain, as an alternative to the standard Dimensionally Extended Nine-Intersection Model (DE-9IM). We present the first hardware architecture to efficiently evaluate "contain" and "touch" (internal, external, corner) relations between streams of pairs of geohash codes, in a high throughput (no stall) fashion. Employing FPGAs for exploiting the bit-level granularity of geohash codes, experimental results show (end-to-end) speedup of more than 20× and 90× over highly optimized single-threaded DE-9IM implementations of the contain and touch predicates, respectively. Furthermore, the PCIe-bound FPGA-based solution outperforms a geohash-based multithreaded CPU implementation by ≈1.8× (touch predicate) while using minimal FPGA resources.

Citations: 7
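
The hierarchical coding mentioned in the geohash abstract above means that one cell contains another exactly when its bit string is a prefix of the other's. The following is a minimal software sketch of that "contain" test only, assuming the standard geohash base-32 alphabet; the helper names `to_bits` and `contains` are hypothetical, and this is not the hardware architecture described in the paper. The "touch" predicate is not covered here.

```python
# Illustrative sketch only: helper names are hypothetical, not from the paper.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def to_bits(code: str) -> str:
    """Expand a geohash string into its bit string (5 bits per character)."""
    return "".join(format(BASE32.index(ch), "05b") for ch in code)


def contains(outer: str, inner: str) -> bool:
    """True if cell `outer` contains cell `inner` (equal cells count as contained).

    Assumption: containment reduces to a bit-level prefix test, which follows
    from the hierarchical subdivision used by geohash codes.
    """
    return to_bits(inner).startswith(to_bits(outer))


print(contains("9q8", "9q8yy"))  # shorter code is a prefix: contained
print(contains("9q8yy", "9q8"))  # a finer cell cannot contain a coarser one
print(contains("9q8", "9q9yy"))  # codes diverge at the third character
```

Working on bit strings rather than characters keeps the sketch valid for variable-precision codes whose length is not a multiple of five bits, provided the caller supplies the bits directly.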

Finding Space-Time Stream Permutations for Minimum Memory and Latency

2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) Pub Date : 2016-05-01 DOI: 10.1109/FCCM.2016.54

Thaddeus Koehn, P. Athanas

Citations: 1

Continuous Online Self-Monitoring Introspection Circuitry for Timing Repair by Incremental Partial-Reconfiguration (COSMIC TRIP)

2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) Pub Date : 2016-05-01 DOI: 10.1145/3158229

Hans Giesen, Benjamin Gojman, Raphael Rubin, Ji Kim, A. DeHon

Citations: 6

Energy Efficiency of Full Pipelining: A Case Study for Matrix Multiplication

2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) Pub Date : 2016-05-01 DOI: 10.1109/FCCM.2016.50

Peipei Zhou, Hyunseok Park, Zhenman Fang, J. Cong, A. DeHon

Citations: 7

Cost Effective Partial Scan for Hardware Emulation

2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) Pub Date : 2016-05-01 DOI: 10.1109/FCCM.2016.39

Tao Li, Qiang Liu

Abstract: FPGA-based hardware emulation platform runs significantly faster than software simulation for verifying complex circuit designs. However, the controllability and observability of circuit internal signals mapped onto FPGAs are restricted due to the limited chip pins. Scan chain-based technique is effective in providing full-chip controllability and observability, at the cost of large area overhead, especially for FPGAs. Therefore, partial scan has been proposed as an alternative way to improve the controllability and observability while reducing the area cost. However, the optimized partial scan solution with the minimum number of scan flip-flops is not always found. This paper formulates the classical balanced structure partial scan procedure in one step as an integer linear programming problem, leading to the optimized partial scan solution. In addition, partially used logic resources in FPGAs are exploited to implement the extra logic required by the scan chain, to further reduce the area cost. Experimental results show that our partial scan approach can reduce the area overhead by 78.6% and 16.6% compared to the full scan and the existing partial scan approach.

Citations: 1
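
To give a feel for what "minimum number of scan flip-flops" means in the partial scan abstract above, here is a deliberately simplified sketch. It is not the paper's integer linear programming formulation and it does not enforce the balanced-structure condition: it only brute-forces the smallest set of flip-flops whose removal leaves the flip-flop dependency graph acyclic. The function names and the toy circuit are assumptions made for illustration.

```python
from itertools import combinations


def has_cycle(nodes, edges):
    """Return True if the directed graph has a cycle (Kahn's algorithm)."""
    indegree = {n: 0 for n in nodes}
    for _, dst in edges:
        indegree[dst] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    visited = 0
    while ready:
        src = ready.pop()
        visited += 1
        for a, b in edges:
            if a == src:
                indegree[b] -= 1
                if indegree[b] == 0:
                    ready.append(b)
    # Any node never reaching indegree 0 sits on or behind a cycle.
    return visited != len(nodes)


def min_scan_set(flops, deps):
    """Smallest set of flip-flops to scan so the remaining graph is acyclic.

    Exhaustive search over subsets by increasing size; exponential, so only
    suitable for toy examples. A hypothetical stand-in for an ILP solver.
    """
    for k in range(len(flops) + 1):
        for chosen in combinations(flops, k):
            kept = [f for f in flops if f not in chosen]
            kept_edges = [(u, v) for u, v in deps if u in kept and v in kept]
            if not has_cycle(kept, kept_edges):
                return set(chosen)
    return set(flops)


# Toy circuit (invented): two feedback loops that share flip-flop "C".
flops = ["A", "B", "C", "D"]
deps = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"), ("D", "C")]
print(min_scan_set(flops, deps))  # scanning "C" alone breaks both loops
```

An ILP formulation would typically replace the exhaustive search with one binary variable per flip-flop and constraints describing the required structure, which is what lets a solver handle realistic circuit sizes.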

Power-Efficient Accelerated Genomic Short Read Mapping on Heterogeneous Computing Platforms

2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) Pub Date : 2016-05-01 DOI: 10.1109/FCCM.2016.17

Ernst Houtgast, V. Sima, G. Marchiori, K. Bertels, Z. Al-Ars

Citations: 6

Vertex-Centric Graph Processing on FPGA

2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) Pub Date : 2016-05-01 DOI: 10.1109/FCCM.2016.31

Nina Engelhardt, Hayden Kwok-Hay So

Citations: 12

DeCO: A DSP Block Based FPGA Accelerator Overlay with Low Overhead Interconnect

2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) Pub Date : 2016-05-01 DOI: 10.1109/FCCM.2016.10

A. Jain, Xiangwei Li, P. Singhai, D. Maskell, Suhaib A. Fahmy

Abstract: Coarse-grained FPGA overlay architectures paired with general purpose processors offer a number of advantages for general purpose hardware acceleration because of software-like programmability, fast compilation, application portability, and improved design productivity. However, the area overheads of these overlays, and in particular architectures with island-style interconnect, negate many of these advantages, preventing their use in practical FPGA-based systems. Crucially, the interconnect flexibility provided by these overlay architectures is normally over-provisioned for accelerators based on feed-forward pipelined datapaths, which in many cases have the general shape of inverted cones. We propose DeCO, a cone shaped cluster of FUs utilizing a simple linear interconnect between them. This reduces the area overheads for implementing compute kernels extracted from compute-intensive applications represented as directed acyclic dataflow graphs, while still allowing high data throughput. We perform design space exploration by modeling programmability overhead as a function of overlay design parameters, and compare to the programmability overhead of island-style overlays. We observe 87% savings in LUT requirements using the proposed approach compared to DSP block based island-style overlays. Our experimental evaluation shows that the proposed overlay exhibits an achievable frequency of 395 MHz, close to the DSP theoretical limit on the Xilinx Zynq. We also present an automated tool flow that provides a rapid and vendor-independent mapping of the high level compute kernel code to the proposed overlay.

Citations: 33
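
The "inverted cone" observation in the DeCO abstract above can be made concrete by counting how many operations of a feed-forward dataflow graph fall on each pipeline level. The sketch below is an assumption-laden illustration, not the DeCO tool flow: the kernel, the function names, and the use of as-soon-as-possible (ASAP) levelling are all chosen here for demonstration.

```python
def asap_levels(kernel):
    """Assign each operation its ASAP level in a directed acyclic dataflow graph.

    `kernel` maps an operation name to the list of its operand names. Names
    that are not keys of `kernel` are treated as primary inputs at level 0.
    Assumes every operation has at least one operand and the graph is acyclic.
    """
    levels = {}

    def level(name):
        if name not in kernel:  # primary input
            return 0
        if name not in levels:
            levels[name] = 1 + max(level(src) for src in kernel[name])
        return levels[name]

    for op in kernel:
        level(op)
    return levels


def width_per_level(kernel):
    """Number of operations on each level, from the first level to the last."""
    counts = {}
    for lvl in asap_levels(kernel).values():
        counts[lvl] = counts.get(lvl, 0) + 1
    return [counts[lvl] for lvl in sorted(counts)]


# Invented kernel: four multiplies reduced by an adder tree, out = sum(x_i * y_i).
kernel = {
    "m0": ["a", "b"], "m1": ["c", "d"], "m2": ["e", "f"], "m3": ["g", "h"],
    "s0": ["m0", "m1"], "s1": ["m2", "m3"],
    "out": ["s0", "s1"],
}
print(width_per_level(kernel))  # [4, 2, 1]
```

A width profile that shrinks from level to level, as in this reduction, is the cone shape the abstract refers to; a cluster with fewer functional units in later stages would match such a kernel without the routing flexibility of an island-style fabric.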