Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays: Latest Articles

An Automated Design Framework for Floating Point Scientific Algorithms using Field Programmable Gate Arrays (FPGAs) (Abstract Only)
Michaela Amoo, Youngsoo Kim, Vance Alford, Shrikant S. Jadhav, Naser El-Bathy, C. Gloster
{"title":"An Automated Design Framework for Floating Point Scientific Algorithms using Field Programmable Gate Arrays (FPGAs) (Abstract Only)","authors":"Michaela Amoo, Youngsoo Kim, Vance Alford, Shrikant S. Jadhav, Naser El-Bathy, C. Gloster","doi":"10.1145/2684746.2689101","DOIUrl":"https://doi.org/10.1145/2684746.2689101","url":null,"abstract":"This paper presents a reconfigurable computing environment that addresses the problem of porting High Performance Computing (HPC) applications directly to Field Programmable Gate Array (FPGA)-based architectures. The objectives of this research are to develop a comprehensive floating point library of essential functions for scientific applications, to demonstrate an order-of-magnitude speedup of reconfigurable computing applications, and to demonstrate the effectiveness of an automated design framework for both development and test of scientific algorithms. The developed framework can be reused in various scientific applications that share kernel functions. This research has identified an exponential function as a kernel for cellular ophthalmoscopy camera processing, traffic monitoring and light wave simulation. The paper demonstrates a 30x speedup of these kernels in three algorithms using its novel architecture and its automated toolset. The exponential kernel generation case study and its flexible hardware implementation on an FPGA have been validated on a Xilinx LX-100 device and the Nallatech H101-PCIXM FPGA board.","PeriodicalId":388546,"journal":{"name":"Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132055680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Using Source-Level Transformations to Improve High-Level Synthesis Debug and Validation on FPGAs
Joshua S. Monson, B. Hutchings
{"title":"Using Source-Level Transformations to Improve High-Level Synthesis Debug and Validation on FPGAs","authors":"Joshua S. Monson, B. Hutchings","doi":"10.1145/2684746.2689087","DOIUrl":"https://doi.org/10.1145/2684746.2689087","url":null,"abstract":"This paper proposes a method for extending source-level visibility into the RTL of an HLS-generated design using automated source-level transformations. Using our method, source-level visibility can be extended into co-simulation, in-system simulation, and hardware execution of any HLS tool that provides the ability to infer top-level ports. Experimental results show the feasibility of our method in situations where visibility needs to be added without modifying the timing, latency, or throughput of the design.","PeriodicalId":388546,"journal":{"name":"Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129716607","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 35
Logic Gates in the routing network of FPGAs (Abstract Only)
Elias Vansteenkiste, Berg Severens, D. Stroobandt
{"title":"Logic Gates in the routing network of FPGAs (Abstract Only)","authors":"Elias Vansteenkiste, Berg Severens, D. Stroobandt","doi":"10.1145/2684746.2689098","DOIUrl":"https://doi.org/10.1145/2684746.2689098","url":null,"abstract":"We propose a new kind of FPGA architecture with a routing network that not only provides interconnections between the functional blocks but also performs some logic operations. More specifically, we replace the routing multiplexer node in the conventional architecture with an element that can be used as both an AND gate and a multiplexer. A conventional routing multiplexer node consists of a multiplexer and a two-stage buffer. In our new architecture, a NAND gate replaces the first inverter stage of the buffer, and two multiplexers half the size of the original multiplexer replace the original multiplexer. The aim of this study is to determine whether this kind of architecture is feasible and whether it is worthwhile to implement packing, placement and routing tools in the future. We developed a new technology-mapping algorithm and sized the transistors in this new architecture to evaluate the area and delay. Preliminary results indicate that the gain in logic depth and area achieved by mapping not only to LUTs but also to AND gates outweighs the overhead of introducing AND gates in the routing network, with a net reduction in area-delay product of 5.6. Designs implemented on the proposed architecture would require 11.2% more area, but they would have a 14% lower logic depth, and the architecture has a slightly faster representative critical path. These results are preliminary because the pack, place and route routines are not implemented yet.","PeriodicalId":388546,"journal":{"name":"Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130597125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Optimized Fixed-Point FPGA Implementation of SVPWM for a Two-Level Inverter (Abstract Only)
D. Mohammadi, S. Ahmed-Zaid, N. Rafla
{"title":"Optimized Fixed-Point FPGA Implementation of SVPWM for a Two-Level Inverter (Abstract Only)","authors":"D. Mohammadi, S. Ahmed-Zaid, N. Rafla","doi":"10.1145/2684746.2689144","DOIUrl":"https://doi.org/10.1145/2684746.2689144","url":null,"abstract":"This paper presents an optimized fixed-point implementation of space-vector pulse-width modulation (SVPWM) for a two-level inverter. The bit widths of the fixed-point signals and the circuit area are minimized while meeting the desired design accuracy. Most of the designs currently available are specified in floating-point precision to speed the process of simulating their functionality. However, area-optimized hardware implementation of these algorithms requires fixed-point precision. A generic function is used to formulate the precision required for each signal to get the proper accuracy. A non-convex optimization problem is solved for the required bit widths of the signals. This solution has been simulated and implemented on an FPGA to verify the resulting accuracy.","PeriodicalId":388546,"journal":{"name":"Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121066466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
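The SVPWM entry above describes choosing a fixed-point precision per signal so that a target accuracy is met. As a rough illustration only, the sketch below quantizes sample values to a given number of fractional bits and searches for the smallest width that stays within a tolerance. The function names, the brute-force search, and the tolerance are assumptions made for this example; the paper itself solves a non-convex optimization problem, which this sketch does not reproduce.

```python
from typing import Iterable


def quantize(value: float, frac_bits: int) -> float:
    """Round value to the nearest multiple of 2**-frac_bits (hypothetical helper)."""
    scale = 1 << frac_bits
    return round(value * scale) / scale


def min_frac_bits(values: Iterable[float], tolerance: float, max_bits: int = 32) -> int:
    """Return the smallest number of fractional bits whose quantization error
    stays within tolerance for every sample value.

    This is a plain linear search used as a stand-in for illustration,
    not the optimization method described in the abstract.
    """
    samples = list(values)
    for bits in range(max_bits + 1):
        if all(abs(v - quantize(v, bits)) <= tolerance for v in samples):
            return bits
    raise ValueError("tolerance not reachable within max_bits")
```

A call such as `min_frac_bits([0.3], 0.01)` asks how many fractional bits one signal needs; a real design would repeat this per signal and weigh the widths against circuit area.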
Design of a Loeffler DCT using Xilinx Vivado HLS (Abstract Only)
Seung Yeol Baik, S. Jeong, H. Oh
{"title":"Design of a Loeffler DCT using Xilinx Vivado HLS (Abstract Only)","authors":"Seung Yeol Baik, S. Jeong, H. Oh","doi":"10.1145/2684746.2694735","DOIUrl":"https://doi.org/10.1145/2684746.2694735","url":null,"abstract":"The Loeffler discrete cosine transform (DCT) algorithm is recognized as the most efficient one because it requires the theoretical minimum number of multiplications. However, many applications still encounter difficulty in performing the 11 multiplications required by the algorithm to calculate a 1D eight-point DCT. To avoid expensive multipliers in the hardware, we used two design methods, namely, distributed arithmetic (DA) and shift-and-add (SAA), to design the DCT accelerator. The memory bandwidth is 60 bits: 24 bits for reads of the R (red), G (green), and B (blue) data of a pixel and 36 bits for writes of three corresponding 12-bit DCT coefficients. Thus, the 1D eight-point DCT accelerator for each of R, G, and B can have one 12-bit input port and one 12-bit output port so that it can calculate a 2D DCT by the row-column decomposition method. The designs are adjusted to produce the same latency and interval. DA seems promising because the Loeffler DCT requires only three small tables with four input bits. However, our experiments using Xilinx Vivado HLS show that the SAA design is better than the DA design for the considered applications. Furthermore, simulation results suggest that the optimal accelerator design can be obtained by adjusting the SAA design to the considered applications. The resultant SAA design requires only 13 adders (per color component) and can calculate one DCT coefficient per clock cycle. The precision of the internal hardware has been adjusted such that the reconstructed images have PSNR values of at least 39.1 dB for all test images (Lenna, Pepper, House, and Cameraman). If a precision of 13 bits is allowed, the PSNR becomes at least 44.8 dB. Our presentation describes the architecture and operation of the optimized SAA design.","PeriodicalId":388546,"journal":{"name":"Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125372408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
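The Loeffler DCT entry above mentions the shift-and-add (SAA) method for avoiding hardware multipliers. The sketch below shows the general idea on a single constant: multiplying by the 8-bit fixed-point approximation 181/256 of cos(pi/4) using only shifts and additions. The constant, its precision, and the function names are assumptions chosen for illustration and are not taken from the paper's design.

```python
def shift_add_mul_181(x: int) -> int:
    """Multiply x by 181 using only shifts and adds.

    181 = 128 + 32 + 16 + 4 + 1, so the product is a sum of shifted copies of x.
    """
    return (x << 7) + (x << 5) + (x << 4) + (x << 2) + x


def scale_by_cos_pi_over_4(x: int) -> int:
    """Approximate x * cos(pi/4) with the fixed-point constant 181/256.

    The final right shift by 8 divides by 256 and rounds toward negative infinity.
    """
    return shift_add_mul_181(x) >> 8
```

In hardware, each shift is just wiring, so a constant multiplication of this kind costs only adders; here, four adders replace one multiplier.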
InTime: A Machine Learning Approach for Efficient Selection of FPGA CAD Tool Parameters
Nachiket Kapre, Harnhua Ng, K. Teo, J. Naude
{"title":"InTime: A Machine Learning Approach for Efficient Selection of FPGA CAD Tool Parameters","authors":"Nachiket Kapre, Harnhua Ng, K. Teo, J. Naude","doi":"10.1145/2684746.2689081","DOIUrl":"https://doi.org/10.1145/2684746.2689081","url":null,"abstract":"FPGA CAD tool parameters controlling synthesis optimizations, place and route effort, and mapping criteria, along with user-supplied physical constraints, can affect timing results of the circuit by as much as 70% without any change in original source code. A correct selection of these parameters across a diverse set of benchmarks with varying characteristics and design goals is challenging. The sheer number of parameters and option values that can be selected is large (thousands of combinations for modern CAD tools), with often conflicting interactions. In this paper, we present InTime, a machine-learning approach supported by a cloud-based (or cluster-based) compilation infrastructure for automating the selection of these parameters effectively to minimize timing costs. InTime builds a database of results from a series of preliminary runs based on canned configurations of CAD options. It then learns from these runs to predict the next series of CAD tool options to improve timing results. Towards the end, we rely on a limited degree of statistical sampling of certain options like placer and synthesis seeds to further tighten results. Using our approach, we show a 70% reduction in final timing results across industrial benchmark problems for the Altera CAD flow. This is 30% better than vendor-supplied design space exploration tools that attempt a similar optimization using canned heuristics.","PeriodicalId":388546,"journal":{"name":"Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132961097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 35
Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays
{"title":"Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays","authors":"","doi":"10.1145/2684746","DOIUrl":"https://doi.org/10.1145/2684746","url":null,"abstract":"","PeriodicalId":388546,"journal":{"name":"Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130699748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2