2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14): Latest Papers, Page 8

Identifying homogenous reconfigurable regions in heterogeneous FPGAs for module relocation

2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14) Pub Date : 2014-12-01 DOI: 10.1109/ReConFig.2014.7032533

Rico Backasch, G. Hempel, Stefan Werner, Sven Groppe, Thilo Pionteck

Citations: 12

Fast and generic hardware architecture for stereo block matching applications on embedded systems

2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14) Pub Date : 2014-12-01 DOI: 10.1109/ReConFig.2014.7032518

K. Häublein, M. Reichenbach, D. Fey

Abstract: Even with the tremendous performance increase of microprocessor architectures in recent years, real-time capturing and computing of stereo images remains a challenging task, particularly in the field of embedded image processing. The stereo block matching technique allows hardware designers to parallelize the process of depth map calculation. Additionally, for smart camera designers it is crucial to adapt hardware architectures to different FPGA platforms, sensor properties, throughput, and accuracy. However, most application-specific implementations of this technique are fixed to a single camera setup to achieve high frame rates, and lack flexibility in these properties. A general approach for a stereo block matching model that is also able to process high-resolution images in real time is still missing. Therefore, we present a new generic VHDL template for fast window-based stereo block matching correlation. It is fully scalable in functional parameters such as image size, window size, and disparity range. Its streaming character even allows HD images to be computed in real time. An interface for a flexible PE structure is also provided. This enables the hardware designer to apply a custom-made cost function, which performs a correlation between the target windows and the reference window. The developer is also able to adapt the model to the available sensor speed and FPGA resource limitations. These features should help designers find the right trade-off between depth map quality and available hardware resources.

Citations: 7
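
To illustrate the window-based block matching summarized in the abstract above, here is a minimal software sketch in Python. It is not the authors' VHDL template. It assumes rectified grayscale images stored as lists of rows and uses the sum of absolute differences (SAD) as one possible cost function. The names `sad` and `disparity_map` and all of their parameters are hypothetical, chosen only for this illustration.

```python
def sad(left, right, y, x, d, half):
    """Sum of absolute differences between the reference window centred at
    (y, x) in the left image and the target window shifted left by d pixels
    in the right image. One possible cost function (an assumption)."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(left[y + dy][x + dx] - right[y + dy][x + dx - d])
    return total


def disparity_map(left, right, window=3, max_disp=4, cost=sad):
    """For every pixel with a full window, pick the disparity in
    [0, max_disp] that minimises the cost. Border pixels stay at 0."""
    half = window // 2
    height, width = len(left), len(left[0])
    disp = [[0] * width for _ in range(height)]
    for y in range(half, height - half):
        for x in range(half, width - half):
            best_d, best_cost = 0, None
            for d in range(max_disp + 1):
                if x - d - half < 0:
                    break  # target window would leave the image
                c = cost(left, right, y, x, d, half)
                if best_cost is None or c < best_cost:
                    best_d, best_cost = d, c
            disp[y][x] = best_d
    return disp
```

Passing the cost function as a parameter loosely mirrors the idea of a swappable cost function mentioned in the abstract. Window size and disparity range are likewise plain arguments here.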

Memory optimisation for hardware induction of axis-parallel decision tree

2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14) Pub Date : 2014-12-01 DOI: 10.1109/ReConFig.2014.7032538

C. Cheng, C. Bouganis

Abstract: In data mining and machine learning applications, the Decision Tree classifier is widely used as a supervised learning method, not only as a stand-alone model but also as part of an ensemble learning technique (e.g. Random Forest). The induction of Decision Trees (i.e. the training stage) involves intense memory communication and inherent parallel processing, making an FPGA device a promising platform for accelerating the training process due to the high memory bandwidth enabled by the embedded memory blocks in the device. However, peak memory bandwidth is reached only when all the channels of the block RAMs on the FPGA are free for concurrent communication, whereas to accommodate large data sets several block RAMs are often combined, which makes a number of memory channels unavailable. Therefore, efficient use of the embedded memory is critical not only for allowing larger training datasets to be processed on an FPGA but also for making as many memory channels as possible available to the rest of the system. In this work, a data compression scheme is proposed for the training data stored in the embedded memory to improve the memory utilisation of the device, targeting specifically the axis-parallel decision tree classifier. The proposed scheme takes advantage of the nature of the decision tree induction problem and improves the memory efficiency of the system without any compromise on the performance of the classifier. It is demonstrated that the scheme can reduce the memory usage by up to 66% for the training datasets under investigation without compromise in training accuracy, while a 28% reduction in training time is achieved due to the extra processing power enabled by the additional memory bandwidth.

Citations: 4

Can high-level synthesis compete against a hand-written code in the cryptographic domain? A case study

2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14) Pub Date : 2014-12-01 DOI: 10.1109/ReConFig.2014.7032504

Ekawat Homsirikamol, K. Gaj

Citations: 45

Efficient FPGA-based implementation of a CAZAC sequence generator for 3GPP LTE

2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14) Pub Date : 2014-12-01 DOI: 10.1109/ReConFig.2014.7032513

F. A. P. Figueiredo, Fabiano S. Mathilde, Fabbryccio A. C. M. Cardoso, Rafael M. Vilela, J. P. Miranda

Citations: 5

PoC-align: An open-source alignment accelerator using FPGAs

2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14) Pub Date : 2014-12-01 DOI: 10.1109/ReConFig.2014.7032548

Thomas B. Preußer, Oliver Knodel, R. Spallek

Abstract: The mapping of reads, i.e. short DNA base pair strings, to large genome databases has become a critical operation for genetic analysis and diagnosis. The underlying alignment operation is essentially a string search that tolerates some character mismatches and possibly character deletions or insertions with respect to a reference genome. Its output comprises the locations within the reference that are likely to correspond to the mapped DNA snippet. This paper describes PoC-Align, an alignment infrastructure using FPGA accelerators. It is an extension of our preceding FPGA aligner [1], which has been enhanced to tolerate alignment gaps (insertions and deletions) and to be more customizable through generic parameters. In addition to describing the implementation of these extensions, we also name the mainly software-carried enhancements, such as support for mapping paired-end reads, that are implemented on top of the FPGA accelerator. By providing a thorough overview of the complete infrastructure, we aim to advertise the disclosure of the sources of our solution and hope to encourage other groups to use and extend this platform.

Citations: 1
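
The abstract above describes alignment as a string search that tolerates mismatches, insertions, and deletions. A minimal software sketch of that idea follows, using the classic dynamic-programming formulation of approximate substring matching under edit distance. This is a swapped-in textbook technique for illustration, not the PoC-Align hardware algorithm. The function name `approximate_matches` and its parameters are hypothetical.

```python
def approximate_matches(read, reference, max_edits):
    """Return (end_position, edits) pairs, with 1-based end positions in
    `reference`, where `read` matches some substring ending there with at
    most `max_edits` mismatches, insertions, or deletions."""
    m = len(read)
    # prev[i]: fewest edits aligning read[:i] to a substring of the
    # reference ending at the previous position.
    prev = list(range(m + 1))
    hits = []
    for j, ref_char in enumerate(reference, start=1):
        curr = [0] * (m + 1)  # curr[0] = 0: a match may start anywhere
        for i in range(1, m + 1):
            substitute = prev[i - 1] + (read[i - 1] != ref_char)
            skip_ref = prev[i] + 1       # reference character left unmatched
            skip_read = curr[i - 1] + 1  # read character left unmatched
            curr[i] = min(substitute, skip_ref, skip_read)
        if curr[m] <= max_edits:
            hits.append((j, curr[m]))
        prev = curr
    return hits
```

For example, `approximate_matches("ACGT", "TTACGTTT", 0)` reports only the exact occurrence. Raising `max_edits` admits candidate locations with mismatches or gaps.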

A unified OpenCL-flavor programming model with scalable hybrid hardware platform on FPGAs

2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14) Pub Date : 2014-12-01 DOI: 10.1109/ReConFig.2014.7032563

Hongyuan Ding, Miaoqing Huang

Abstract: Hardware accelerators are capable of achieving significant performance improvement. However, designing hardware accelerators lacks flexibility and productivity. Combining hardware accelerators with a multiprocessor system-on-chip (MPSoC) is an alternative way to balance flexibility, productivity, and performance. In this work, we present a unified hybrid OpenCL-flavor (HOpenCL) parallel programming model on MPSoC supporting both hardware and software kernels. By integrating the HOpenCL hardware IPs and software libraries, the same kernel function can execute either as hardware kernels on the dedicated hardware accelerators or as software kernels on the general-purpose processors. Using the automatic design flow, the corresponding hybrid hardware platform is generated along with the executable. We use a 512×512 matrix multiplication to examine the potential of our hybrid system in terms of performance, scalability, and productivity. The results show that hardware kernels reach more than 10 times speedup compared with the software kernels. Our prototype platform also demonstrates good performance scalability when the number of group computation units (GCUs) increases from 1 to 6, until the problem becomes memory bound. Compared with the hard ARM core on the Zynq 7045 device, we find that the performance of one ARM core is equivalent to 2 or 3 GCUs with software kernel implementations. On the other hand, a single GCU with a hardware kernel implementation is 5 times faster than the ARM core.

Citations: 10

A hardware generator for factor graph applications

2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14) Pub Date : 2014-06-08 DOI: 10.1109/ReConFig.2014.7032490

James Demma, P. Athanas

Abstract: A Factor Graph (FG, http://en.wikipedia.org/wiki/Factor_graph) is a structure used to find solutions to problems that can be represented as a Probabilistic Graphical Model (PGM). FGs consist of interconnected variable nodes and factor nodes, which iteratively compute and pass messages to each other. FGs can be applied to the decoding of forward error correcting codes, Markov chains and Markov Random Fields, Kalman filtering, Fourier transforms, and even some games such as Sudoku. In this paper, a framework is presented for rapid prototyping of hardware implementations of FG-based applications. The FG developer specifies aspects of the application, and the framework returns a design. A system of Python scripts and Verilog Hardware Description Language templates is used to generate the HDL source code for the application. The generated designs are vendor/platform agnostic, but currently target the Xilinx Virtex-6-based ML605. The framework has so far been applied primarily to construct Low Density Parity Check (LDPC) decoders. The characteristics of a large basket of generated LDPC decoders, including contemporary 802.11n decoders, have been examined as a verification of the system and as a demonstration of its capabilities. As a further demonstration, the framework has been applied to construct a Sudoku solver.

Citations: 2

FPGA-based accelerator development for non-engineers

2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14) Pub Date : 2014-04-15 DOI: 10.1109/ReConFig.2014.7032522

David Uliana, P. Athanas, Krzysztof Kepa

Abstract: In today's world of big-data computing, access to massive, complex data sets has reached an unprecedented level, and the task of intelligently processing such data into useful information has become a growing concern to the high-performance computing community. However, domain experts, who are the brains behind this processing, typically lack the skills required to build FPGA-based hardware accelerators ideal for their applications, as traditional development flows targeting such hardware require digital design expertise. This work proposes a usable, end-to-end accelerator development methodology that attempts to bridge this gap between domain experts and the vast computational capacity of FPGA-based heterogeneous platforms. To accomplish this, a development flow was assembled, targeting the Convey Hybrid-Core HC-1 heterogeneous platform and utilizing an existing graphical design environment for design entry. The efficacy of the flow in extending FPGA-based acceleration to non-engineers in the life sciences was informally tested at an NSF-funded summer workshop, organized and hosted by a bioinformatics organization at a particular university. A group of five life-science-focused, non-engineer participants made significant modifications to a bare-bones Smith-Waterman accelerator, extending its functionality and improving performance.

Citations: 1
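
The entry above mentions a bare-bones Smith-Waterman accelerator. For readers unfamiliar with the algorithm, here is a minimal software sketch of the Smith-Waterman local alignment score with a linear gap penalty. It says nothing about how the workshop accelerator was built. The function name and the scoring values (`match=2`, `mismatch=-1`, `gap=-1`) are assumptions chosen for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between sequences a and b, computed
    row by row so only two rows of the score matrix are kept."""
    prev = [0] * (len(b) + 1)
    best = 0
    for char_a in a:
        curr = [0]  # column 0 is always 0
        for j, char_b in enumerate(b, start=1):
            diagonal = prev[j - 1] + (match if char_a == char_b else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            # Clamping at 0 lets a local alignment restart anywhere.
            curr.append(max(0, diagonal, up, left))
            best = max(best, curr[-1])
        prev = curr
    return best
```

Each cell depends only on its upper, left, and upper-left neighbours, so cells on the same anti-diagonal are independent. That independence is the property commonly exploited when parallelising this recurrence in hardware.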

FPGA-based reconfigurable unit for real-time power quality index estimation

2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14) Pub Date : 2014-02-05 DOI: 10.1109/ReConFig.2014.7032521

M. Lopez-Ramirez, L. Ledesma-Carrillo, Ana L. Martinez-Herrera, E. Cabal-Yépez, H. Miranda-Vidales

Citations: 7