2008 International Conference on Field Programmable Logic and Applications: Latest Papers

Loop unrolling and shifting for reconfigurable architectures
2008 International Conference on Field Programmable Logic and Applications Pub Date : 2008-09-23 DOI: 10.1109/FPL.2008.4629926
O. S. Dragomir, T. Stefanov, K. Bertels
Abstract: Loops are an important source of optimization. In this paper, we propose a new technique for optimizing loops that contain kernels mapped on a reconfigurable fabric. We assume the Molen machine organization and programming paradigm as our framework. The method we propose extends our previous work on loop unrolling for reconfigurable architectures by combining unrolling with shifting to relocate the function calls contained in the loop body such that in every iteration of the transformed loop, software functions (running on GPP) execute in parallel with multiple instances of the kernel (running on FPGA). The algorithm is based on profiling information about the kernel's execution times on GPP and FPGA, memory transfers and area utilization. In the experimental part, we apply this method to a loop nest extracted from an MPEG2 encoder containing the DCT kernel. The achieved speedup is 19.65x over software execution and 1.8x over loop unrolling.
Citations: 10
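The reordering described in the abstract above can be illustrated with a small sequential sketch. This is not the authors' algorithm or the Molen toolchain: the names `original_loop`, `unrolled_and_shifted`, `pre`, `kernel`, and `post` are hypothetical, the unroll factor `u` is assumed to divide the iteration count, and plain Python cannot show the actual GPP/FPGA overlap. It only shows that, after unrolling by `u` and shifting the software function `post` by one transformed iteration, each iteration pairs a batch of `u` kernel calls with the software work left over from the previous batch, and that the result is unchanged.

```python
def original_loop(n, pre, kernel, post):
    """Reference loop: software pre(), hardware-candidate kernel(), software post()."""
    out = []
    for i in range(n):
        a = pre(i)
        b = kernel(a)
        out.append(post(b))
    return out


def unrolled_and_shifted(n, u, pre, kernel, post):
    """Unroll by u, then shift post() one transformed iteration later.

    Assumption: u divides n. On a real platform the u kernel() calls of one
    batch would run as parallel hardware instances while post() for the
    previous batch runs in software; here both simply run in sequence.
    """
    if n % u != 0:
        raise ValueError("this sketch assumes the unroll factor divides n")
    out = []
    pending = []  # kernel results of the previous batch, still awaiting post()
    for base in range(0, n, u):
        inputs = [pre(i) for i in range(base, base + u)]
        # These two statements are independent of each other, which is what
        # allows them to overlap on separate processing elements.
        results = [kernel(a) for a in inputs]
        out.extend(post(b) for b in pending)
        pending = results
    # Epilogue: finish the software work of the last batch.
    out.extend(post(b) for b in pending)
    return out
```

The shift introduces an epilogue (and an empty first `post` step) in exchange for removing the dependency between the kernel batch and the software function inside each transformed iteration.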
A reconfigurable accelerator for quantum computations
2008 International Conference on Field Programmable Logic and Applications Pub Date : 2008-09-23 DOI: 10.1109/FPL.2008.4630024
M. Zampetakis, V. Samoladas, A. Dollas
Abstract: This paper presents a new architecture to accelerate quantum computations, using reconfigurable computing structures. It was designed post place and route, validated with standard benchmarks, and its performance has been evaluated vs. a high end computer running the same benchmarks with highly optimized code. The acceleration of arbitrary quantum circuits is very promising because it extends the kind of quantum computing problems that can be solved with reconfigurable architectures.
Citations: 0
Separable implementation of the second order Volterra filter (SOVF) in Xilinx Virtex-E FPGA
2008 International Conference on Field Programmable Logic and Applications Pub Date : 2008-09-23 DOI: 10.1109/FPL.2008.4630001
M. Al-Mistarihi
Abstract: Post-beamforming second order Volterra filter (SOVF) was previously introduced for decomposing the pulse echo ultrasonic radio-frequency (RF) signal into its linear and quadratic components. Using singular value decomposition (SVD), an optimal minimum-norm least squares algorithm for deriving the coefficients of the linear and quadratic kernels of the SOVF was developed and verified. The "separable" implementation algorithm of a SOVF based on the eigenvalue decomposition (EVD) of the quadratic kernel was introduced and verified. In this paper, the "separable" version of a second order Volterra filter is implemented in a Xilinx Virtex-E FPGA. Parallel operation, efficient use of instructions per task, and data streaming capability of FPGA are identified. This implementation should allow for real-time implementation of quadratic filtering on commercial ultrasound scanners.
Citations: 3
A low overhead fault tolerant FPGA with new connection box
2008 International Conference on Field Programmable Logic and Applications Pub Date : 2008-09-23 DOI: 10.1109/FPL.2008.4630029
F. Wong, Yajun Ha
Abstract: With the increasing process variations in advanced semiconductor technologies, fault tolerance has become one of several essential issues in building Field Programmable Gate Arrays (FPGAs). Unfortunately, there has been much less fault tolerance work previously done on FPGA interconnects, which take up to 90% of an FPGA device, than on its logic blocks. In view of this, we develop a low overhead connection block architecture, which improves the fault tolerance of FPGA interconnects. By testing 10 MCNC benchmarks on the new architecture, FPGA fault tolerance reaches levels comparable to adding 2 extra wire tracks per channel, with the average timing overhead below 2.5% and the area overheads of only 2.5% to 4%.
Citations: 3
Increasing the level of abstraction in FPGA-based designs
2008 International Conference on Field Programmable Logic and Applications Pub Date : 2008-09-23 DOI: 10.1109/FPL.2008.4629899
M. Danek, J. Kadlec, R. Bartosinski, L. Kohout
Abstract: Traditional design techniques for FPGAs are based on using hardware description languages, with functional and post-place-and-route simulation as a means to check design correctness and remove detected errors. Given the growing complexity of designs, it is necessary to introduce new design approaches that will increase the level of abstraction while maintaining the efficiency of hardware computation that we are used to today. This paper presents one such methodology that builds upon existing research in multithreading, object composability and encapsulation, partial runtime reconfiguration, and self adaptation. The methodology is based on currently available FPGA design tools. The efficiency of the methodology is evaluated on basic vector and matrix operations.
Citations: 15
A dedicated DMA logic addressing a time multiplexed memory to reduce the effects of the system bus bottleneck
2008 International Conference on Field Programmable Logic and Applications Pub Date : 2008-09-23 DOI: 10.1109/FPL.2008.4629990
C. Brunelli, F. Garzia, Carmelo Giliberto, J. Nurmi
Abstract: A very common problem which affects the performance of bus-based computing systems arises from the fact that the bus is a common resource which needs to be shared between a number of master devices. Contention for this common resource forces one or more of the bus masters to stall temporarily, slowing down the execution. Moreover, the width of the bus is usually relatively small, forcing the bus master to perform several bus cycles in order to transfer a data block from the main memory to a peripheral (or to a processing element), and the other way around. The combination of these factors leads to problems and inefficiencies which designers need to solve. In this paper we present dedicated hardware that allows an external accelerator to access the system memory independently of the main microprocessor. The proposed device is able to exchange data with the memory in a DMA-like fashion and to generate memory addresses properly in order to access it in an efficient way. Results show that using such a solution it is possible to reach a considerable speed-up in the execution of a given algorithm.
Citations: 29
Compiler generated systolic arrays for wavefront algorithm acceleration on FPGAs
2008 International Conference on Field Programmable Logic and Applications Pub Date : 2008-09-23 DOI: 10.1109/FPL.2008.4630032
B. Buyukkurt, W. Najjar
Abstract: Wavefront algorithms, such as the Smith-Waterman algorithm, are commonly used in bioinformatics for exact local and global sequence alignment. These algorithms are highly computationally intensive and are therefore excellent candidates for FPGA-based code acceleration. However, there is no standard form of these algorithms; they are used in a wide variety of situations with various constraints. It is therefore not practical to have a standard kernel that can be mapped to an FPGA, hence the importance of being able to compile such codes from a high level language. ROCCC is a C to VHDL compiler, which optimizes and parallelizes the most frequently executed kernel loops in applications such as in multimedia, scientific and high-performance computing. In this paper we describe the transformations performed by ROCCC, which transformed the kernel of the Smith-Waterman algorithm into a hardware systolic array that is mapped onto the FPGA on the SGI Altix RASC blade. We report a throughput increase by over 3,000 times over a 2.8 GHz Xeon.
Citations: 40
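To make the term "wavefront algorithm" in the abstract above concrete, here is a minimal software sketch of Smith-Waterman local alignment scoring computed along anti-diagonals. It is not ROCCC output and not the paper's hardware design: the function names, the linear gap penalty, and the scoring values (`match=2`, `mismatch=-1`, `gap=-1`) are assumptions chosen for illustration. Every cell on one anti-diagonal depends only on cells from the two previous anti-diagonals, which is the property a systolic array can exploit by computing a whole anti-diagonal in parallel.

```python
def sw_score_rowmajor(a, b, match=2, mismatch=-1, gap=-1):
    """Reference: best local alignment score, filled row by row."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best


def sw_score_wavefront(a, b, match=2, mismatch=-1, gap=-1):
    """Same recurrence, visited one anti-diagonal (i + j == d) at a time."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for d in range(2, rows + cols - 1):
        i_lo = max(1, d - (cols - 1))  # keep j = d - i within the matrix
        i_hi = min(rows - 1, d - 1)    # keep j >= 1
        # All cells in this inner loop are mutually independent.
        for i in range(i_lo, i_hi + 1):
            j = d - i
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best
```

Both functions evaluate the same recurrence and differ only in traversal order, so they return the same score; the wavefront order is the one that exposes parallelism.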
A new methodology for debugging and validation of soft cores
2008 International Conference on Field Programmable Logic and Applications Pub Date : 2008-09-23 DOI: 10.1109/FPL.2008.4630006
C. Hochberger, A. Weiss
Abstract: The amount of time and resources that have to be spent on debugging of embedded cores continuously increases. Approaches valid 10 years ago can no longer be used due to the variety and complexity of peripheral components of SoC solutions that even might consist of multiple heterogeneous cores. In this contribution we show how debugging and tracing of embedded processor cores can be enhanced by use of an externally synchronized CPU core.
Citations: 5
A comparison of embedded reconfigurable video-processing architectures
2008 International Conference on Field Programmable Logic and Applications Pub Date : 2008-09-23 DOI: 10.1109/FPL.2008.4630015
Christopher Claus, W. Stechele, Matthias Kovatsch, Josef Angermeier, J. Teich
Abstract: Using field programmable gate arrays (FPGAs) as accelerators for image or video processing operations and algorithms has gained increasing attention over the last few years. One reason for that is FPGAs are able to exploit both temporal and spatial parallelism. In this paper two platforms for FPGA-based real-time image and video processing are presented and compared against each other. With both of these platforms it is possible to update the physical resources during run-time by exploiting the dynamic partial reconfiguration capabilities of Xilinx Virtex FPGAs. The analysis of both platforms with respect to their benefits and drawbacks has led to the concept of an optimal FPGA-based dynamically and partially reconfigurable platform for real-time video and image processing.
Citations: 20
Direct sigma-delta modulated signal processing in FPGA
2008 International Conference on Field Programmable Logic and Applications Pub Date : 2008-09-23 DOI: 10.1109/FPL.2008.4629987
Chiu-Wah Ng, N. Wong, Hayden Kwok-Hay So, T. Ng
Abstract: The effectiveness of implementing bit-stream signal processing (BSSP) multiplier circuits in FPGAs, in terms of hardware resources and clock frequency, is presented. In particular, the result of realizing BSSP multipliers on FPGA architectures that utilize 6-input lookup tables (LUTs) is compared against architectures that utilize 4-input LUTs. It is found that architectures featuring 6-input LUTs suit well in BSSP applications where wide combinatorial paths are common. Furthermore, the performance of a BSSP multiplier is compared against conventional parallel multipliers in terms of LUT resource requirements. For a given resource requirement, it is found that an over-sampling ratio of less than 32 is required for a BSSP multiplier to outperform its parallel counterpart.
Citations: 2