Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines: Latest Papers

Implementing a simple continuous speech recognition system on an FPGA
S. Melnikoff, S. Quigley, M. J. Russell
{"title":"Implementing a simple continuous speech recognition system on an FPGA","authors":"S. Melnikoff, S. Quigley, M. J. Russell","doi":"10.1109/FPGA.2002.1106682","DOIUrl":"https://doi.org/10.1109/FPGA.2002.1106682","url":null,"abstract":"Speech recognition is a computationally demanding task, particularly the stage which uses Viterbi decoding for converting pre-processed speech data into words or sub-word units. We present an FPGA implementation of the decoder based on continuous hidden Markov models (HMMs) representing monophones, and demonstrate that it can process speech 75 times real time, using 45% of the slices of a Xilinx Virtex XCV1000.","PeriodicalId":272235,"journal":{"name":"Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121827041","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 59
The effects of datapath placement and C-slow retiming on three computational benchmarks
N. Weaver, J. Wawrzynek
{"title":"The effects of datapath placement and C-slow retiming on three computational benchmarks","authors":"N. Weaver, J. Wawrzynek","doi":"10.1109/FPGA.2002.1106694","DOIUrl":"https://doi.org/10.1109/FPGA.2002.1106694","url":null,"abstract":"Summary form only given. Two important optimizations within the FPGA design process, C-slow retiming and datapath placement, offer significant benefits for designers. Many have advocated and implemented tools to use these techniques in both automatic and semiautomatic manner but they have not made their way into conventional FPGA toolflows. C-slow retiming is a method of accelerating computations that include feedback loops. Instead of having a single instance of the computation, the feedback loop is pipelined so that C separate instances are all calculated simultaneously. This allows fine grained pipelining to occur even in designs that include feedback loops, such as single round cryptographic implementations or microprocessors. Done properly, it imposes a significant but not imposing latency penalty for single computations while offering huge increases in throughput. Datapath placement is simply constructing the design in a manner that accounts for the higher level data flows. This offers several benefits, including improved performance, more physically compact designs, shorter wires, and faster place and route times when the FPGA is heavily utilized. Even for designs with less structure which are amenable to simulated annealing, datapath placement may still offer a significant benefit. To clearly demonstrate the importance of these optimizations we have hand-modified three computational benchmarks which represent significant themes within FPGA computation: Rijndael/AES encryption, Smith/Waterman, and a simplified 32-bit microprocessor datapath. All three represent significantly different modes of computation within FPGAs, but all gain significantly from the use of these techniques.","PeriodicalId":272235,"journal":{"name":"Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122889451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
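The C-slow idea summarized in the abstract above (a feedback loop pipelined so that C separate instances are computed at once) can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the authors' tool or benchmarks: the update function `step` and the names `run_single` and `run_c_slow` are hypothetical, and the model shows only the interleaving of C independent streams through one shared loop, not the clock-rate gain that the extra register stages would allow in hardware.

```python
from collections import deque


def step(state, x):
    """One iteration of the feedback computation (hypothetical toy update)."""
    return (state * 31 + x) % 1_000_003


def run_single(inputs, init=0):
    """Reference model: one stream through the original feedback loop."""
    state = init
    for x in inputs:
        state = step(state, x)
    return state


def run_c_slow(streams, init=0):
    """Toy model of a C-slowed loop.

    The feedback path holds C register stages instead of one, so C
    independent streams are interleaved and exactly one of them
    advances on each clock cycle.
    """
    c = len(streams)
    length = len(streams[0])
    assert all(len(s) == length for s in streams)
    loop_regs = deque([init] * c)        # the C register stages in the loop
    for cycle in range(length * c):
        stream_id = cycle % c            # stream whose state emerges this cycle
        state = loop_regs.popleft()
        x = streams[stream_id][cycle // c]
        loop_regs.append(step(state, x))
    return list(loop_regs)               # entry i is the final state of stream i


if __name__ == "__main__":
    streams = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(run_c_slow(streams))
    print([run_single(s) for s in streams])
```

Both printed lists should match: each stream sees exactly the same sequence of updates as it would alone, but a single result now takes C cycles to come back around the loop, which corresponds to the latency penalty the abstract mentions.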
Reconfigurable object detection in FLIR image sequences
Jonathan E. Scalera, Creed F. Jones, M. Soni, Mark B. Bucciero, P. Athanas, A. L. Abbott, Amitabh Mishra
{"title":"Reconfigurable object detection in FLIR image sequences","authors":"Jonathan E. Scalera, Creed F. Jones, M. Soni, Mark B. Bucciero, P. Athanas, A. L. Abbott, Amitabh Mishra","doi":"10.1109/FPGA.2002.1106686","DOIUrl":"https://doi.org/10.1109/FPGA.2002.1106686","url":null,"abstract":"Future surveillance operations will require easily deployable \"microsensors\" that are capable of autonomous detection and identification of objects. These devices will operate under severe limitations on energy consumption, to enable battery-powered operation. They will assess sensor inputs locally, transmitting data only after objects of interest have been detected and extracted from sensor data. This paper describes a prototype system that detects and tracks moving objects in image sequences obtained from an infrared video camera. Computation in the system is distributed across an FPGA and a DSP chip. The current system analyzes input images, in search of objects that meet predefined criteria. If these criteria are met, the system extracts a sub-image (or \"chip\") that contains an object of interest, and then transmits that to a central site for manual review and further analysis.","PeriodicalId":272235,"journal":{"name":"Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines","volume":"101-B 9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125750365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
GRIP: a reconfigurable architecture for host-based gigabit-rate packet processing
P. Bellows, J. Flidr, T. Lehman, Brian Schott, K. Underwood
{"title":"GRIP: a reconfigurable architecture for host-based gigabit-rate packet processing","authors":"P. Bellows, J. Flidr, T. Lehman, Brian Schott, K. Underwood","doi":"10.1109/FPGA.2002.1106667","DOIUrl":"https://doi.org/10.1109/FPGA.2002.1106667","url":null,"abstract":"One of the fundamental challenges for modern high-performance network interfaces is the processing capabilities required to process packets at high speeds. Simply transmitting or receiving data at gigabit speeds fully utilizes the CPU on a standard workstation. Any processing that must be done to the data, whether at the application layer or the network layer, decreases the achievable throughput. This paper presents an architecture for offloading a significant portion of the network processing from the host CPU onto the network interface. A prototype, called the GRIP (Gigabit Rate IPSec) card, has been constructed based on an FPGA coupled with a commodity Gigabit Ethernet MAC. Experimental results based on the prototype are presented and analyzed. In addition, a second generation design is presented in the context of lessons learned from the prototype.","PeriodicalId":272235,"journal":{"name":"Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130238530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 38
Fast area estimation to support compiler optimizations in FPGA-based reconfigurable systems
D. Kulkarni, W. Najjar, R. Rinker, F. Kurdahi
{"title":"Fast area estimation to support compiler optimizations in FPGA-based reconfigurable systems","authors":"D. Kulkarni, W. Najjar, R. Rinker, F. Kurdahi","doi":"10.1109/FPGA.2002.1106678","DOIUrl":"https://doi.org/10.1109/FPGA.2002.1106678","url":null,"abstract":"Several projects have developed compiler tools that translate high-level languages down to hardware description languages for mapping onto FPGA-based reconfigurable computers. These compiler tools can apply extensive transformations that exploit the parallelism inherent in the computations. However, the transformations can have a major impact on the chip area (number of logic blocks) used on the FPGA. It is imperative therefore that the compiler user be provided with feedback indicating how much space is being used. In this paper we present a fast compile-time area estimation technique to guide the compiler optimizations. Experimental results show that our technique achieves an accuracy within 2.5% for small image-processing operators, and within 5.0% for larger benchmarks, as compared to the usual post-compilation synthesis tool estimations. The estimation time is in the order of milliseconds as compared to several minutes for a synthesis tool.","PeriodicalId":272235,"journal":{"name":"Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132386525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 59
Precis: a design-time precision analysis tool
Mark L. Chang, S. Hauck
{"title":"Precis: a design-time precision analysis tool","authors":"Mark L. Chang, S. Hauck","doi":"10.1109/FPGA.2002.1106677","DOIUrl":"https://doi.org/10.1109/FPGA.2002.1106677","url":null,"abstract":"Currently, few tools exist to aid the FPGA developer in translating an algorithm designed for a general-purpose processor into one that is precision-optimized for FPGAs. This task requires extensive knowledge of both the algorithm and the target hardware. We present a design-time tool, Precis, which assists the developer in analyzing the precision requirements of algorithms specified in MATLAB. Through the combined use of simulation, user input, and program analysis, we demonstrate a methodology for precision analysis that can aid the developer in focusing their manual precision optimization efforts.","PeriodicalId":272235,"journal":{"name":"Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116947713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 38
Customising floating-point designs
A. A. Gaffar, W. Luk, P. Cheung, N. Shirazi
{"title":"Customising floating-point designs","authors":"A. A. Gaffar, W. Luk, P. Cheung, N. Shirazi","doi":"10.1109/FPGA.2002.1106698","DOIUrl":"https://doi.org/10.1109/FPGA.2002.1106698","url":null,"abstract":"This paper describes a method for customising the representation of floating-point numbers that exploits the flexibility of reconfigurable hardware. The method determines the appropriate size of mantissa and exponent for each operation in a design, so that a cost function with a given error specification for the output relative to a reference representation can be satisfied. Currently our tool, which adopts an iterative implementation of this method, supports single- or double-precision floating-point representation as the reference representation. It produces customised floating-point formats with arbitrary-sized mantissa and exponent. Results show that, for calculations involving large dynamic ranges, our method can achieve significant hardware reduction and speed improvement with respect to a design adopting the reference representation.","PeriodicalId":272235,"journal":{"name":"Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126943371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
Tabu search with intensification strategy for functional partitioning in hardware-software codesign
T. Wiangtong, P. Cheung, W. Luk
{"title":"Tabu search with intensification strategy for functional partitioning in hardware-software codesign","authors":"T. Wiangtong, P. Cheung, W. Luk","doi":"10.1109/FPGA.2002.1106691","DOIUrl":"https://doi.org/10.1109/FPGA.2002.1106691","url":null,"abstract":"This paper presents a tabu search (TS) method with an intensification strategy for hardware-software partitioning. The algorithm operates on functional blocks for designs represented as directed acyclic graphs (DAG), with the objective of minimising processing time under various hardware area constraints. Results are compared to two other heuristic search algorithms: genetic algorithm (GA) and simulated annealing (SA). The comparison involves a scheduling model based on list scheduling for calculating processing time used as a system cost, assuming that shared resource conflicts do not occur. The results show that TS, which is rarely used for this kind of problem, is superior to SA and GA in terms of both search time and the quality of solutions. In addition, we have implemented an intensification strategy in TS called penalty reward, which can further improve the quality of results.","PeriodicalId":272235,"journal":{"name":"Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128699516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15
Hardware-assisted fast routing
A. DeHon, Randy Huang, J. Wawrzynek
{"title":"Hardware-assisted fast routing","authors":"A. DeHon, Randy Huang, J. Wawrzynek","doi":"10.1109/FPGA.2002.1106675","DOIUrl":"https://doi.org/10.1109/FPGA.2002.1106675","url":null,"abstract":"To fully realize the benefits of partial and rapid reconfiguration of field-programmable devices, we often need to dynamically schedule computing tasks and generate instance-specific configurations (new graphs which must be routed during program execution). Consequently, route time can be a significant overhead cost reducing the achievable net benefits of dynamic configuration generation. By adding hardware to accelerate routing, we show that it is possible to compute routes in one thousandth the time of a traditional, software router and achieve routes that are within 5% of the state-of-the-art offline routing algorithms for a sample set of application netlists and within 25% for a set of difficult synthetic benchmarks. We further outline how strategic use of parallelism can allow the total route time to scale substantially less than linearly in graph size. We detail the source of the benefits in our approach and survey a range of options for hardware assistance that vary from a speedup of over 10× with modest hardware overhead to speedups in excess of 1000×.","PeriodicalId":272235,"journal":{"name":"Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114344464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 57
Design and analysis of a layer seven network processor accelerator using reconfigurable logic
G. Memik, S. Memik, W. Mangione-Smith
{"title":"Design and analysis of a layer seven network processor accelerator using reconfigurable logic","authors":"G. Memik, S. Memik, W. Mangione-Smith","doi":"10.1109/FPGA.2002.1106668","DOIUrl":"https://doi.org/10.1109/FPGA.2002.1106668","url":null,"abstract":"In this paper, we present an accelerator that is designed to improve performance of network processing applications, particularly layer seven networking applications. The accelerator can easily be integrated in Network Processors. We present the design details of two different FPGA implementations: a design where each task is implemented in the accelerator and another one where the accelerator must be partially reconfigured for different tasks. We also present novel algorithms for important tasks such as tree lookup and pattern matching that utilize the accelerator. We show that the accelerator improves the overall execution time by as much as 20 times for these tasks. We show that the accelerator can improve the execution time of a representative layer seven application by an order of magnitude. Finally, we discuss the effects of reconfiguration time and frequency on the performance of the accelerator.","PeriodicalId":272235,"journal":{"name":"Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123771728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 27