2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines: Latest Articles

Enabling Hardware Exploration in Software-Defined Networking: A Flexible, Portable OpenFlow Switch
Asif Khan, Nirav H. Dave
{"title":"Enabling Hardware Exploration in Software-Defined Networking: A Flexible, Portable OpenFlow Switch","authors":"Asif Khan, Nirav H. Dave","doi":"10.1109/FCCM.2013.15","DOIUrl":"https://doi.org/10.1109/FCCM.2013.15","url":null,"abstract":"The OpenFlow framework allows the data plane of a network switch to be managed by a software-based controller. This enables a software-defined networking model in which sophisticated network management policies can be deployed. In this paper, we present an FPGA-based switch which is fully-compliant with OpenFlow 1.0, and meets the 10 Gbps line rate. The switch design is both modular and highly parametrized. It has generic split-transaction interfaces and isolated platform-specific features, making it both flexible for architectural exploration and portable across FPGA platforms. The flow tables in the switch can be implemented on Block RAM or DRAM without any modifications to the rest of the design. The switch has been ported to the NetFPGA-10G, the ML605 and the DE4 boards. It can be integrated with a Desktop PC via either the PCIe or the serial link, and with an FPGA-based MIPS64 softcore as a coprocessor. The latter FPGA-based switch-processor system provides an ideal platform for network research in which both the data plane and the control plane can be explored.","PeriodicalId":269887,"journal":{"name":"2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117027890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 30
Exploring Manycore Multinode Systems for Irregular Applications with FPGA Prototyping
Marco Ceriani, G. Palermo, Simone Secchi, Antonino Tumeo, Oreste Villa
{"title":"Exploring Manycore Multinode Systems for Irregular Applications with FPGA Prototyping","authors":"Marco Ceriani, G. Palermo, Simone Secchi, Antonino Tumeo, Oreste Villa","doi":"10.1109/FCCM.2013.62","DOIUrl":"https://doi.org/10.1109/FCCM.2013.62","url":null,"abstract":"We propose an intermediate approach between full custom hardware systems and full-software tools. Figure 1 shows the overview of the proposed architecture. We start from an off-the-shelf architecture composed of simple, in-order cores and an on-chip interconnection. The on-chip interconnection interfaces the processing core with the memory controller for the external memory (DDR3) and the shared I/O peripherals. We add three custom components: the Global Memory Access Scheduler (GMAS), the Global Network Interface (GNI) and the Global SYNChronization module (GSYNC). The GMAS enables support for the scrambled address space. It also implements part of the support for latency tolerance, storing remote memory operations, and acts as a scheduler for lightweight software multithreading.","PeriodicalId":269887,"journal":{"name":"2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133337787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Accelerating Join Operation for Relational Databases with FPGAs
R. Halstead, Bharat Sukhwani, Hong Min, Mathew S. Thoennes, Parijat Dube, S. Asaad, B. Iyer
{"title":"Accelerating Join Operation for Relational Databases with FPGAs","authors":"R. Halstead, Bharat Sukhwani, Hong Min, Mathew S. Thoennes, Parijat Dube, S. Asaad, B. Iyer","doi":"10.1109/FCCM.2013.17","DOIUrl":"https://doi.org/10.1109/FCCM.2013.17","url":null,"abstract":"In this paper, we investigate the use of field programmable gate arrays (FPGAs) to accelerate relational joins. Relational join is one of the most CPU-intensive, yet commonly used, database operations. Hashing can be used to reduce the time complexity from quadratic (naïve) to linear time. However, doing so can introduce false positives to the results which must be resolved. We present a hash-join engine on FPGA that performs hashing, conflict resolution, and joining on a PCIe-attached system, achieving greater than 11x speedup over software.","PeriodicalId":269887,"journal":{"name":"2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123661718","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 64
A Configurable Architecture for a Visual Saliency System and Its Application in Retail
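The hash-join idea summarized in the abstract above (hash to avoid the quadratic nested loop, then resolve false positives) can be illustrated in software. This is a minimal sketch and not the FPGA engine described in the paper: the function name `hash_join`, its parameters, and the deliberately small `num_buckets` are hypothetical choices made for illustration only.

```python
from collections import defaultdict


def hash_join(left, right, key_left, key_right, num_buckets=8):
    """Equi-join two lists of dicts on left[key_left] == right[key_right].

    Hypothetical helper for illustration; not the paper's hardware design.
    """
    # Build phase: place each left row in a bucket chosen by its hashed key.
    # A small bucket count makes collisions between different keys likely.
    buckets = defaultdict(list)
    for row in left:
        buckets[hash(row[key_left]) % num_buckets].append(row)

    # Probe phase: each right row only inspects the one bucket it hashes to.
    joined = []
    for r in right:
        for l in buckets.get(hash(r[key_right]) % num_buckets, []):
            # Rows in the same bucket may carry different keys (a false
            # positive), so the actual key values are compared before joining.
            if l[key_left] == r[key_right]:
                joined.append({**l, **r})
    return joined


customers = [{"id": 1, "name": "a"}, {"id": 3, "name": "c"}]
orders = [{"cid": 1, "item": "x"}, {"cid": 3, "item": "y"}, {"cid": 5, "item": "z"}]
result = hash_join(customers, orders, "id", "cid", num_buckets=2)
```

With `num_buckets=2`, the keys 1, 3 and 5 all share one bucket, so the explicit key comparison is what keeps the mismatched pairs out of the result.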
Nandhini Chandramoorthy, Siddharth Advani, K. Irick, N. Vijaykrishnan
{"title":"A Configurable Architecture for a Visual Saliency System and Its Application in Retail","authors":"Nandhini Chandramoorthy, Siddharth Advani, K. Irick, N. Vijaykrishnan","doi":"10.1109/FCCM.2013.41","DOIUrl":"https://doi.org/10.1109/FCCM.2013.41","url":null,"abstract":"Summary form only given. The objective of this paper is to present a configurable architecture for a visual saliency model based on AIM. It presents algorithmic enhancements to AIM that facilitate the design of a performance-efficient hardware architecture that offers tradeoffs between accuracy, resource utilization and latency. The AIM computational model involves (1) extraction of a set of coefficient features for each local patch in an image, (2) estimation of probability density for each coefficient with respect to its local surround, (3) computation of their product to give a joint likelihood and (4) computation of the self information of each pixel from its log likelihood. Calculation of likelihood with respect to each pixel individually in a local surround is computationally expensive. It proposes to approximate the contribution of pixels in the surround in terms of “cells” grouped further into “support zones”, whose widths are configurable. This approximation leads to nearly a 10x reduction in the number of multipliers, a critical resource, for a 41x41 surround size.","PeriodicalId":269887,"journal":{"name":"2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122862895","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Accuracy-Performance Tradeoffs on an FPGA through Overclocking
Kan Shi, D. Boland, G. Constantinides
{"title":"Accuracy-Performance Tradeoffs on an FPGA through Overclocking","authors":"Kan Shi, D. Boland, G. Constantinides","doi":"10.1109/FCCM.2013.10","DOIUrl":"https://doi.org/10.1109/FCCM.2013.10","url":null,"abstract":"Embedded applications can often demand stringent latency requirements. While high degrees of parallelism within custom FPGA-based accelerators may help to some extent, it may also be necessary to limit the precision used in the datapath to boost the operating frequency of the implementation. However, by reducing the precision, the engineer introduces quantization error into the design. In this paper, we demonstrate that for many applications it would be preferable to simply overclock the design and accept that timing violations may arise. Since the errors introduced by timing violations occur rarely, they will cause less noise than quantization errors. Through the use of analytical models and empirical results on a Xilinx Virtex-6 FPGA, we show that a geometric mean reduction of 67.9% to 98.8% in error expectation or a geometric mean improvement of 3.1% to 27.6% in operating frequency can be obtained using this alternative design methodology.","PeriodicalId":269887,"journal":{"name":"2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123771161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 24
A Delay-based PUF Design Using Multiplexers on FPGA
Miaoqing Huang, Shiming Li
{"title":"A Delay-based PUF Design Using Multiplexers on FPGA","authors":"Miaoqing Huang, Shiming Li","doi":"10.1109/FCCM.2013.11","DOIUrl":"https://doi.org/10.1109/FCCM.2013.11","url":null,"abstract":"Summary form only given. Physically unclonable functions (PUFs) have been a hot research topic in hardware-oriented security for many years. Given a challenge as an input to the PUF, it generates a corresponding response, which can be treated as a unique fingerprint or signature for authentication purpose. In this paper, a delay-based PUF design involving multiplexers on FPGA is presented. Due to the intrinsic difference of the switching latencies of two chained multiplexers, a positive pulse may be produced at the output of the downstream multiplexer. This pulse can be used to set the output of a D flip-flop to `1'. The proposed design improves the randomness of the outputs of the PUF.","PeriodicalId":269887,"journal":{"name":"2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128688638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Efficient Large Integer Squarers on FPGA
Simin Xu, Suhaib A. Fahmy, I. Mcloughlin
{"title":"Efficient Large Integer Squarers on FPGA","authors":"Simin Xu, Suhaib A. Fahmy, I. Mcloughlin","doi":"10.1109/FCCM.2013.35","DOIUrl":"https://doi.org/10.1109/FCCM.2013.35","url":null,"abstract":"This paper presents an optimised high throughput architecture for integer squaring on FPGAs. The approach reduces the number of DSP blocks required compared to a standard multiplier. Previous work has proposed the tiling method for double precision squaring, using the least number of DSP blocks so far. However that approach incurs a large overhead in terms of look-up table (LUT) consumption and has a complex and irregular structure that is not suitable for higher word size. The architecture proposed in this paper can reduce DSP block usage by an equivalent amount to the tiling method while incurring a much lower LUT overhead: 21.8% fewer LUTs for a 53-bit squarer. The architecture is mapped to a Xilinx Virtex 6 FPGA and evaluated for a wide range of operand word sizes, demonstrating its scalability and efficiency.","PeriodicalId":269887,"journal":{"name":"2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115849487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Application Composition and Communication Optimization in Iterative Solvers Using FPGAs
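The squarer abstract above notes that squaring needs fewer DSP blocks than a general multiplier. One well-known reason is the symmetry of the partial products: when an operand is split into limbs, each cross product a_i * a_j (i != j) appears twice and can be computed once and doubled. The sketch below only demonstrates that identity in software; it is not the architecture proposed in the paper, and the 17-bit limb width, limb count, and function names are assumptions chosen for illustration.

```python
def split_limbs(x, width, count):
    """Split a non-negative integer into `count` limbs of `width` bits, least significant first."""
    mask = (1 << width) - 1
    return [(x >> (i * width)) & mask for i in range(count)]


def square_by_symmetry(x, width=17, count=4):
    """Square x using one multiply per unordered limb pair.

    Assumes 0 <= x < 2**(width * count). Returns (x squared, limb multiplies used).
    """
    limbs = split_limbs(x, width, count)
    total = 0
    multiplies = 0
    for i in range(count):
        # Diagonal term a_i^2 has weight 2^(2*i*width).
        total += (limbs[i] * limbs[i]) << (2 * i * width)
        multiplies += 1
        for j in range(i + 1, count):
            # Cross term a_i*a_j occurs twice; the extra shift by 1 doubles it.
            total += (limbs[i] * limbs[j]) << ((i + j) * width + 1)
            multiplies += 1
    return total, multiplies


x = (1 << 53) - 1  # largest 53-bit operand
square, multiplies = square_by_symmetry(x)
```

For four limbs this uses 10 limb multiplications, compared with the 16 a schoolbook multiplier of the same limb layout would need.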
A. Rafique, Nachiket Kapre, G. Constantinides
{"title":"Application Composition and Communication Optimization in Iterative Solvers Using FPGAs","authors":"A. Rafique, Nachiket Kapre, G. Constantinides","doi":"10.1109/FCCM.2013.16","DOIUrl":"https://doi.org/10.1109/FCCM.2013.16","url":null,"abstract":"We consider the problem of minimizing communication with off-chip memory and composition of multiple linear algebra kernels in iterative solvers for solving large-scale eigenvalue problems and linear systems of equations. While GPUs may offer higher throughput for individual kernels, overall application performance is limited by the inability to support on-chip sharing of data across kernels. In this paper, we show that higher on-chip memory capacity and superior on-chip communication bandwidth enable FPGAs to better support the composition of a sequence of kernels within these iterative solvers. We present a time-multiplexed FPGA architecture which exploits the on-chip capacity to store dependencies between kernels and high communication bandwidth to move data. We propose a resource-constrained framework to select the optimal value of an algorithmic parameter which provides the tradeoff between communication and computation cost for a particular FPGA. Using the Lanczos Method as a case study, we show how to minimize communication on FPGAs by this tight algorithm-architecture interaction and get superior performance over GPU despite its ~5x larger off-chip memory bandwidth and ~2x greater peak single-precision floating-point performance.","PeriodicalId":269887,"journal":{"name":"2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116526615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Safe Overclocking of Tightly Coupled CGRAs and Processor Arrays using Razor
Alexander Brant, Ameer Abdelhadi, Douglas H. H. Sim, S. Tang, Michael Xi Yue, G. Lemieux
{"title":"Safe Overclocking of Tightly Coupled CGRAs and Processor Arrays using Razor","authors":"Alexander Brant, Ameer Abdelhadi, Douglas H. H. Sim, S. Tang, Michael Xi Yue, G. Lemieux","doi":"10.1109/FCCM.2013.63","DOIUrl":"https://doi.org/10.1109/FCCM.2013.63","url":null,"abstract":"Overclocking a CPU is a common practice among home-built PC enthusiasts where the CPU is operated at a higher frequency than its speed rating. This practice is unsafe because timing errors cannot be detected by modern CPUs and they can be practically undetectable by the end user. Using a timing speculation technique such as Razor, it is possible to detect timing errors in CPUs. To date, Razor has been shown to correct only unidirectional, feed-forward processor pipelines. In this paper, we safely overclock 2D arrays by extending Razor correction to cover bidirectional communication in a tightly coupled or lockstep fashion. To recover from an error, stall wavefronts are produced which propagate across the device. Multiple errors may arise in close proximity in time and space; if the corresponding stall wavefronts collide, they merge to produce a single unified wavefront, allowing recovery from multiple errors with one stall cycle. We demonstrate the correctness and viability of our approach by constructing a proof-of-concept prototype which runs on a traditional Altera FPGA. Our approach can be applied to custom computing arrays, systolic arrays, CGRAs, and also time-multiplexed FPGAs such as those produced by Tabula. As a result, these devices can be overclocked and safely tolerate dynamic, data-dependent timing errors. Alternatively, instead of overclocking, this same technique can be used to `undervolt' the power supply and save energy.","PeriodicalId":269887,"journal":{"name":"2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128253827","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13
Acceleration of SQL Restrictions and Aggregations through FPGA-Based Dynamic Partial Reconfiguration
C. Dennl, Daniel Ziener, J. Teich
{"title":"Acceleration of SQL Restrictions and Aggregations through FPGA-Based Dynamic Partial Reconfiguration","authors":"C. Dennl, Daniel Ziener, J. Teich","doi":"10.1109/FCCM.2013.38","DOIUrl":"https://doi.org/10.1109/FCCM.2013.38","url":null,"abstract":"SQL query processing on large database systems is recognized as one of the most important emerging disciplines of computing nowadays. However, current approaches do not provide a substantial coverage of typical query operators in hardware. In this paper, we provide an important step to higher operator coverage by proposing a) full dynamic data path generation to also support complex operators such as restrictions and aggregations. b) Also, an analysis of the computation times of real database queries when running on a normal desktop computer is proposed to show that c) speedups ranging between 4 and 50 are obtainable by providing generative support also for the important restrict and aggregate operators using FPGAs.","PeriodicalId":269887,"journal":{"name":"2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124099702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 56