2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines: Latest Papers (Page 4)

Enabling Hardware Exploration in Software-Defined Networking: A Flexible, Portable OpenFlow Switch

2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines Pub Date : 2013-04-28 DOI: 10.1109/FCCM.2013.15

Asif Khan, Nirav H. Dave

Citations: 30

Exploring Manycore Multinode Systems for Irregular Applications with FPGA Prototyping

2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines Pub Date : 2013-04-28 DOI: 10.1109/FCCM.2013.62

Marco Ceriani, G. Palermo, Simone Secchi, Antonino Tumeo, Oreste Villa

Citations: 5

Accelerating Join Operation for Relational Databases with FPGAs

2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines Pub Date : 2013-04-28 DOI: 10.1109/FCCM.2013.17

R. Halstead, Bharat Sukhwani, Hong Min, Mathew S. Thoennes, Parijat Dube, S. Asaad, B. Iyer

Citations: 64

A Configurable Architecture for a Visual Saliency System and Its Application in Retail

2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines Pub Date : 2013-04-28 DOI: 10.1109/FCCM.2013.41

Nandhini Chandramoorthy, Siddharth Advani, K. Irick, N. Vijaykrishnan

Abstract: Summary form only given. The objective of this paper is to present a configurable architecture for a visual saliency model based on AIM. It presents algorithmic enhancements to AIM that facilitate the design of a performance-efficient hardware architecture offering tradeoffs between accuracy, resource utilization, and latency. The AIM computational model involves (1) extraction of a set of coefficient features for each local patch in an image, (2) estimation of the probability density of each coefficient with respect to its local surround, (3) computation of their product to give a joint likelihood, and (4) computation of the self-information of each pixel from its log likelihood. Calculating the likelihood with respect to each pixel individually in a local surround is computationally expensive. The paper proposes to approximate the contribution of pixels in the surround in terms of "cells" grouped further into "support zones", whose widths are configurable. This approximation leads to nearly a 10x reduction in the number of multipliers, a critical resource, for a 41x41 surround size.
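The four AIM steps listed in the abstract can be illustrated with a minimal sketch. It is not the paper's architecture: it assumes the coefficient features of step (1) are already available as an array scaled to [0, 1), estimates each density with a simple histogram count over a square surround, and does not implement the "cells" and "support zones" approximation. The function name `self_information` and the parameters `radius` and `bins` are hypothetical.

```python
import numpy as np

def self_information(coeffs, radius=2, bins=8):
    """Per-pixel self-information from a (K, H, W) array of coefficients.

    Assumption: coefficients are pre-extracted (step 1) and lie in [0, 1).
    """
    k, h, w = coeffs.shape
    # Quantize every coefficient into a histogram bin index.
    q = np.clip((coeffs * bins).astype(int), 0, bins - 1)
    info = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            # Square surround, clipped at the image border.
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            log_likelihood = 0.0
            for c in range(k):
                surround = q[c, y0:y1, x0:x1]
                # Step 2: density of this coefficient value in its surround.
                # The surround contains the pixel itself, so p is never 0.
                p = np.mean(surround == q[c, y, x])
                # Step 3: product of densities, accumulated as a sum of logs.
                log_likelihood += np.log(p)
            # Step 4: self-information is the negative log likelihood.
            info[y, x] = -log_likelihood
    return info
```

In this sketch a pixel whose coefficients are rare within its surround receives a high value, while a pixel in a uniform region receives zero. The inner loops visit every surround pixel individually, which is the per-pixel cost the abstract describes as expensive.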

Citations: 1

Accuracy-Performance Tradeoffs on an FPGA through Overclocking

2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines Pub Date : 2013-04-28 DOI: 10.1109/FCCM.2013.10

Kan Shi, D. Boland, G. Constantinides

Citations: 24

A Delay-based PUF Design Using Multiplexers on FPGA

2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines Pub Date : 2013-04-28 DOI: 10.1109/FCCM.2013.11

Miaoqing Huang, Shiming Li

Citations: 6

Application Composition and Communication Optimization in Iterative Solvers Using FPGAs

2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines Pub Date : 2013-04-28 DOI: 10.1109/FCCM.2013.16

A. Rafique, Nachiket Kapre, G. Constantinides

Abstract: We consider the problem of minimizing communication with off-chip memory and composing multiple linear algebra kernels in iterative solvers for large-scale eigenvalue problems and linear systems of equations. While GPUs may offer higher throughput for individual kernels, overall application performance is limited by their inability to support on-chip sharing of data across kernels. In this paper, we show that higher on-chip memory capacity and superior on-chip communication bandwidth enable FPGAs to better support the composition of a sequence of kernels within these iterative solvers. We present a time-multiplexed FPGA architecture that exploits the on-chip capacity to store dependencies between kernels and the high communication bandwidth to move data. We propose a resource-constrained framework to select the optimal value of an algorithmic parameter that trades off communication against computation cost for a particular FPGA. Using the Lanczos method as a case study, we show how to minimize communication on FPGAs through this tight algorithm-architecture interaction and achieve superior performance over a GPU despite its ~5x larger off-chip memory bandwidth and ~2x greater peak single-precision floating-point performance.
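To make the "sequence of kernels" in the abstract concrete, here is a minimal sketch of the textbook Lanczos iteration for a symmetric matrix. It is not the paper's FPGA architecture or its communication-optimized variant, and it omits reorthogonalization and breakdown handling (it assumes no intermediate vector has zero norm). The function name `lanczos` and its parameters are hypothetical.

```python
import numpy as np

def lanczos(A, v0, m):
    """Run m Lanczos steps on symmetric A; return basis V and tridiagonal T."""
    n = len(v0)
    V = np.zeros((n, m))
    alpha = np.zeros(m)      # diagonal of T
    beta = np.zeros(m - 1)   # off-diagonal of T
    v = v0 / np.linalg.norm(v0)
    v_prev = np.zeros(n)
    b = 0.0
    for j in range(m):
        V[:, j] = v
        # Kernel 1: matrix-vector product, minus the previous direction.
        w = A @ v - b * v_prev
        # Kernel 2: dot product.
        alpha[j] = w @ v
        # Kernel 3: vector update (axpy).
        w = w - alpha[j] * v
        if j < m - 1:
            # Kernel 4: norm, then scaling to get the next basis vector.
            b = np.linalg.norm(w)
            beta[j] = b
            v_prev, v = v, w / b
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return V, T
```

Each iteration consumes vectors produced by the previous kernel (`w`, `v`, `v_prev`), which is the kind of inter-kernel dependency the abstract proposes to keep on chip. The eigenvalues of the small matrix `T` approximate those of `A`, for example via `np.linalg.eigvalsh(T)`.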

Citations: 5

Efficient Large Integer Squarers on FPGA

2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines Pub Date : 2013-04-28 DOI: 10.1109/FCCM.2013.35

Simin Xu, Suhaib A. Fahmy, I. Mcloughlin

Citations: 4

The Effect of Compiler Optimizations on High-Level Synthesis for FPGAs

2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines Pub Date : 2013-04-28 DOI: 10.1109/FCCM.2013.50

Qijing Huang, Ruolong Lian, Andrew Canis, Jongsok Choi, R. Xi, S. Brown, J. Anderson

Citations: 51

Safe Overclocking of Tightly Coupled CGRAs and Processor Arrays using Razor

2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines Pub Date : 2013-04-28 DOI: 10.1109/FCCM.2013.63

Alexander Brant, Ameer Abdelhadi, Douglas H. H. Sim, S. Tang, Michael Xi Yue, G. Lemieux

Abstract: Overclocking a CPU is a common practice among home-built PC enthusiasts where the CPU is operated at a higher frequency than its speed rating. This practice is unsafe because timing errors cannot be detected by modern CPUs and they can be practically undetectable by the end user. Using a timing speculation technique such as Razor, it is possible to detect timing errors in CPUs. To date, Razor has been shown to correct only unidirectional, feed-forward processor pipelines. In this paper, we safely overclock 2D arrays by extending Razor correction to cover bidirectional communication in a tightly coupled or lockstep fashion. To recover from an error, stall wavefronts are produced which propagate across the device. Multiple errors may arise in close proximity in time and space; if the corresponding stall wavefronts collide, they merge to produce a single unified wavefront, allowing recovery from multiple errors with one stall cycle. We demonstrate the correctness and viability of our approach by constructing a proof-of-concept prototype which runs on a traditional Altera FPGA. Our approach can be applied to custom computing arrays, systolic arrays, CGRAs, and also time-multiplexed FPGAs such as those produced by Tabula. As a result, these devices can be overclocked and safely tolerate dynamic, data-dependent timing errors. Alternatively, instead of overclocking, this same technique can be used to "undervolt" the power supply and save energy.

Citations: 13