Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines: Latest Articles

Mapping multi-mode circuits to LUT-based FPGA using embedded MUXes
T. Courtney, R. Turner, Roger Francis Woods
{"title":"Mapping multi-mode circuits to LUT-based FPGA using embedded MUXes","authors":"T. Courtney, R. Turner, Roger Francis Woods","doi":"10.1109/FPGA.2002.1106699","DOIUrl":"https://doi.org/10.1109/FPGA.2002.1106699","url":null,"abstract":"For some systems, a general-purpose FPGA solution tends to be large and slow. A reconfigurable solution is smaller and faster but has a delay associated with the reconfiguration. In this paper, embedded MUXes are used to achieve the performance of reconfiguration without the time penalty. For a CRC circuit an area reduction of 93% compared to a general-purpose solution and a reduction of 17-34% compared to similar software compiled systems is achieved.","PeriodicalId":272235,"journal":{"name":"Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125345341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Coarse-grain pipelining on multiple FPGA architectures
H. Ziegler, Byoungro So, Mary W. Hall, P. Diniz
{"title":"Coarse-grain pipelining on multiple FPGA architectures","authors":"H. Ziegler, Byoungro So, Mary W. Hall, P. Diniz","doi":"10.1109/FPGA.2002.1106663","DOIUrl":"https://doi.org/10.1109/FPGA.2002.1106663","url":null,"abstract":"Reconfigurable systems, and in particular, FPGA-based custom computing machines, offer a unique opportunity to define application-specific architectures. These architectures offer performance advantages for application domains such as image processing, where the use of customized pipelines exploits the inherent coarse-grain parallelism. In this paper we describe a set of program analyses and an implementation that map a sequential and un-annotated C program into a pipelined implementation running on a set of FPGAs, each with multiple external memories. Based on well-known parallel computing analysis techniques, our algorithms perform unrolling for operator parallelization, reuse and data layout for memory parallelization and precise communication analysis. We extend these techniques for FPGA-based systems to automatically partition the application data and computation into custom pipeline stages, taking into account the available FPGA and interconnect resources. We illustrate the analysis components by way of an example, a machine vision program. We present the algorithm results, derived with minimal manual intervention, which demonstrate the potential of this approach for automatically deriving pipelined designs from high-level sequential specifications.","PeriodicalId":272235,"journal":{"name":"Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129461865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 43
Using on-chip configurable logic to reduce embedded system software energy
G. Stitt, Brian Grattan, J. Villarreal, F. Vahid
{"title":"Using on-chip configurable logic to reduce embedded system software energy","authors":"G. Stitt, Brian Grattan, J. Villarreal, F. Vahid","doi":"10.1109/FPGA.2002.1106669","DOIUrl":"https://doi.org/10.1109/FPGA.2002.1106669","url":null,"abstract":"We examine the energy savings possible by re-mapping critical software loops from a microprocessor to configurable logic appearing on the same chip in commodity chips now commercially available. That logic is typically intended to implement peripherals and coprocessors without increasing chip count, but we show that reduced software energy is an additional benefit, making such chips even more useful. We find critical software loops and re-implement them in the configurable logic such that a repeating software task completes sooner, allowing us to put the system in a low-power state for longer periods, thus reducing energy. We use simulations and estimations for a hypothetical device having a 32-bit MIPS processor plus configurable logic, yielding energy savings of 25%, increasing to 39% assuming voltage scaling. We physically measured several examples running on two commercial single-chip devices having an 8-bit 8051 microprocessor plus configurable logic and a 32-bit ARM microprocessor with configurable logic, with energy savings of 71% and 53% respectively, increasing to an estimated 89% and 75% assuming voltage scaling.","PeriodicalId":272235,"journal":{"name":"Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129799462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 42
Custom computing machines for the set covering problem
Christian Plessl, M. Platzner
{"title":"Custom computing machines for the set covering problem","authors":"Christian Plessl, M. Platzner","doi":"10.1109/FPGA.2002.1106671","DOIUrl":"https://doi.org/10.1109/FPGA.2002.1106671","url":null,"abstract":"We present instance-specific custom computing machines for the set covering problem. Four accelerator architectures are developed that implement branch & bound in 3-valued logic and many of the deduction techniques found in software solvers. We use set covering benchmarks from two-level logic minimization and Steiner triple systems to derive and discuss experimental results. The resulting raw speedups are in the order of four magnitudes on average. Finally, we propose a hybrid solver architecture that combines the raw speed of instance-specific reconfigurable hardware with flexible bounding schemes implemented in software.","PeriodicalId":272235,"journal":{"name":"Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128721165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
The design of the Amalgam reconfigurable cluster
Joshua D. Walstrom, Jeffrey J. Cook, Derek B. Gottlieb, Steve Ferrera, Chi-Wei Wang, N. Carter
{"title":"The design of the Amalgam reconfigurable cluster","authors":"Joshua D. Walstrom, Jeffrey J. Cook, Derek B. Gottlieb, Steve Ferrera, Chi-Wei Wang, N. Carter","doi":"10.1109/FPGA.2002.1106696","DOIUrl":"https://doi.org/10.1109/FPGA.2002.1106696","url":null,"abstract":"Amalgam is a novel architecture for multifunction embedded systems. It integrates multiple reconfigurable and programmable processing resources (known as clusters) to achieve high-performance with low design effort on a variety of multimedia applications. The reconfigurable cluster (RClust) enables Amalgam to exploit the natural parallelism and operator granularities of a target application. The RClust contains a ring of reconfigurable logic interleaved with a banked register file to support Amalgam's register-based inter-cluster communication mechanism. This low-latency mechanism allows the RClust to coordinate with a programmable cluster (PClust) as a special purpose functional unit implementing small custom operations. The relatively large size of the cluster, however, allows it to implement larger, more independent computational kernels. In this extended abstract, we describe the initial design of the RClust and present results from mapping several benchmarks to Amalgam architectures with and without RClust elements.","PeriodicalId":272235,"journal":{"name":"Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines","volume":"128 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124618208","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
A scalable FPGA-based custom computing machine for a medical image processing
T. Yokota, Masamichi Nagafuchi, Y. Mekada, T. Yoshinaga, K. Ootsu, T. Baba
{"title":"A scalable FPGA-based custom computing machine for a medical image processing","authors":"T. Yokota, Masamichi Nagafuchi, Y. Mekada, T. Yoshinaga, K. Ootsu, T. Baba","doi":"10.1109/FPGA.2002.1106695","DOIUrl":"https://doi.org/10.1109/FPGA.2002.1106695","url":null,"abstract":"Concentration index filter is a kind of spatial filters of images, and its typical application is diagnosis from medical images. This paper presents a dedicated computing engine for concentration index filtering. Original algorithm is modified to extract full parallelism and data width is optimized for maximizing clock speed and minimizing hardware scale. Evaluation results reveal that the system runs 100 times faster than current workstation and enables real-time diagnosis.","PeriodicalId":272235,"journal":{"name":"Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124412020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13
Accelerating radiosity calculations using reconfigurable platforms
Henry Styles, W. Luk
{"title":"Accelerating radiosity calculations using reconfigurable platforms","authors":"Henry Styles, W. Luk","doi":"10.1109/FPGA.2002.1106684","DOIUrl":"https://doi.org/10.1109/FPGA.2002.1106684","url":null,"abstract":"We describe a feasibility study into accelerating computer graphics radiosity calculations using reconfigurable hardware. A modular hardware/software codesign framework has been developed for experimenting with hardware acceleration of a time consuming step: formfactor determination. We describe a parameterised hardware design pattern, captured in the Handel-C language, which enables rapid exploration of the area/throughput design space for simple pipelines. Using this pattern we determine speedup and resource usage on a range of Xilinx Virtex FPGA devices, and examine future trends in performance. As a sample of these results we demonstrate a 7.6 times speed-up over a 1.4GHz Athlon PC using a Xilinx XCV2000E and, based on place and route reports, estimate 31 times speed-up using a Xilinx XC2V8000.","PeriodicalId":272235,"journal":{"name":"Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126458522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
Using floating-point arithmetic on FPGAs to accelerate scientific N-Body simulations
G. Lienhart, A. Kugel, R. Männer
{"title":"Using floating-point arithmetic on FPGAs to accelerate scientific N-Body simulations","authors":"G. Lienhart, A. Kugel, R. Männer","doi":"10.1109/FPGA.2002.1106673","DOIUrl":"https://doi.org/10.1109/FPGA.2002.1106673","url":null,"abstract":"This paper investigates the usage of floating-point arithmetic on FPGAs for N-Body simulation in natural science. The common aspect of these applications is the simple computing structure where forces between a particle and its surrounding particles are summed up. The role of reduced precision arithmetic is discussed, and our implementation of a floating-point arithmetic library with parameterized operators is presented. On the base of this library, implementation strategies of complex arithmetic units are discussed. Finally the realization of a fully pipelined pressure force calculation unit consisting of 60 floating-point operators with a resulting performance of 3.9 Gflops on an off the shelf FPGA is presented.","PeriodicalId":272235,"journal":{"name":"Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines","volume":"99 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134636692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 99
Hyperspectral image compression on reconfigurable platforms
T. W. Fry, S. Hauck
{"title":"Hyperspectral image compression on reconfigurable platforms","authors":"T. W. Fry, S. Hauck","doi":"10.1109/FPGA.2002.1106679","DOIUrl":"https://doi.org/10.1109/FPGA.2002.1106679","url":null,"abstract":"In this paper we present an implementation of the image compression routine SPIHT in reconfigurable logic. A discussion on why adaptive logic is required, as opposed to an ASIC, is provided along with background material on the image compression algorithm. We analyzed several discrete wavelet transform architectures and selected the folded DWT design. In addition we provide a study on what storage elements are required for each wavelet coefficient. The paper uses a modification to the original SPIHT algorithm needed to parallelize the computation. The architecture of the SPIHT engine is based upon fixed-order SPIHT, developed specifically for use within adaptive hardware. For an N × N image fixed-order SPIHT may be calculated in N²/4 cycles. Square images which are powers of 2 up to 1024 × 1024 are supported by the architecture. Our system was developed on an Annapolis Microsystems WildStar board populated with Xilinx Virtex-E parts.","PeriodicalId":272235,"journal":{"name":"Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines","volume":"96 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132764803","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 57
Single-chip gigabit mixed-version IP router on Virtex-II Pro
G. Brebner
{"title":"Single-chip gigabit mixed-version IP router on Virtex-II Pro","authors":"G. Brebner","doi":"10.1109/FPGA.2002.1106659","DOIUrl":"https://doi.org/10.1109/FPGA.2002.1106659","url":null,"abstract":"This paper concerns novel single-chip system architecture options, based on the Xilinx Virtex-II Pro part, which includes up to four PowerPC cores and was launched in Spring 2002. The research described here was carried out pre-launch (i.e., prior to availability of real parts), so the paper focuses on initial architectural experiments based on simulation. The application is a Mixed-version IP Router, named MIR, servicing gigabit ethernet ports. This would be of use to organizations with several gigabit ethernets, with a mixture of IPv4 and IPv6 hosts and routers attached directly to the networks. A particular benefit of a programmable approach based on Virtex-II Pro is that the router's functions can evolve smoothly, maintaining router performance as the organization migrates from IPv4 to IPv6 internally, and also as the Internet migrates externally. The basic aim is to carry out more frequent, and less control intensive, functions in logic, and other functions in the processor. Two prototypes are described here. Both support four ethernet ports, but the designs are scalable upwards. The second one, the more ambitious of the two, instantiates a configuration appropriate when the bulk of the incoming packets are IPv4. Such packets are processed and switched entirely by logic, with no internal copying of packets between buffers and virtually no delay between packet receipt and onward forwarding. This involves a specially-tailored internal interconnection network between the four ports, and also processing performed in parallel with packet receipt, i.e. multi-threading in logic. IPv6 packets, or some rare IPv4 cases, are passed to a PowerPC core for processing. In essence, the PowerPC acts as a slave to the logic, rather than the more common opposite master-slave relationship.","PeriodicalId":272235,"journal":{"name":"Proceedings. 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115344130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 22