2014 International Conference on Field-Programmable Technology (FPT): Latest Papers, Page 6

An improved FPGA-based specific processor for Blokus Duo

2014 International Conference on Field-Programmable Technology (FPT) Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082822

J. Olivito, A. Delmas, J. Resano

Citations: 0

A circuit to synchronize high speed serial communication channel

2014 International Conference on Field-Programmable Technology (FPT) Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082784

Mrinal J. Sarmah, Syed Azeemuddin

Citations: 2

Development productivity in implementing a complex heterogeneous computing application

2014 International Conference on Field-Programmable Technology (FPT) Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082809

Anthony Milton, D. Kearney, S. Wong, S. Lemmo

Abstract: The FPGA platform is increasingly faced with a multitude of competitor parallel computing architectures such as GPUs and various multicore variants. These competitor parallel platforms are attractive because they involve a software-based development flow, resulting in greater developer productivity. While it has been argued that FPGA applications written in traditional hardware description languages (HDLs) may require nearly an order of magnitude more development time than corresponding parallel software development (PSD) for multi-core CPU or GPU, there are modern approaches to hardware design that drastically increase development productivity that are beginning to gain traction. One approach adopted in this work is use of the high-level HDL Bluespec. This paper compares Bluespec FPGA development with PSD for multi-core CPU and GPU, by detailing the experiences of a project that involved developing various components of a complex multi-object visual tracking algorithm for each of these platforms. We found that the development time using Bluespec was competitive with the combined development time for the CPU and GPU versions, but that limitations with the Bluespec development chain (such as lack of native floating-point support) and component integration issues with the FPGA design were areas of significant weakness for the FPGA platform. Finally, we present performance results for the various implementations of the visual tracking algorithm developed in this work, and show that the FPGA platform has the potential to exceed the performance of the CPU and GPU platforms when implementation issues can be overcome for this application.

Pages: 322-325

Citations: 0

Design space exploration for FPGA-based hybrid multicore architecture

2014 International Conference on Field-Programmable Technology (FPT) Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082795

Jian Yan, Junqi Yuan, Y. Wang, P. Leong, Lingli Wang

Citations: 0

A survey on security and trust of FPGA-based systems

2014 International Conference on Field-Programmable Technology (FPT) Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082768

Jiliang Zhang, G. Qu

Citations: 27

AMMC: Advanced Multi-Core Memory Controller

2014 International Conference on Field-Programmable Technology (FPT) Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082802

Tassadaq Hussain, Oscar Palomar, O. Unsal, A. Cristal, E. Ayguadé, M. Valero, Shakaib A. Gursal

Citations: 7

Approaching overhead-free execution on FPGA soft-processors

2014 International Conference on Field-Programmable Technology (FPT) Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082760

Charles Eric LaForest, J. Anderson, J. Gregory Steffan

Abstract: Implementing systems on FPGA soft-processors, rather than as custom hardware, eases and accelerates the development process, but at the cost of a great reduction in performance. Orthogonal to limitations in parallelism or clock frequency, this reduction in performance primarily originates in the intrinsic addressing and flow-control overheads of scalar microprocessors, which expend a considerable number of cycles interleaving address calculations and branch decisions within the actual useful work. We present an improved FPGA soft-processor architecture which statically overlaps "overhead" computations and executes them in parallel with the "useful" computations, significantly reducing the number of processor cycles needed to execute sequential programs, while reducing maximum clock frequency to 0.939x of its original value. In addition to eliminating almost all overhead computations, the proposed soft-processor can operate at 500 MHz on the Altera Stratix IV FPGA - 0.909x of the absolute maximum rating. Combined, the high speed and execution efficiency increase the range of FPGA designs amenable to soft-processors rather than custom hardware. We evaluate our cycle count improvements with multiple benchmarks, achieving speedups ranging from 1.07x for control-heavy code, to 1.92x for looping code, never performing worse than the original sequential code, and always performing better than a totally unrolled loop.

Pages: 99-106
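As a rough illustration of how the figures quoted in this abstract combine (this is not a result reported by the authors), the sketch below assumes that execution time equals cycle count divided by clock frequency, and that the 0.939x frequency ratio and the cycle-count speedups are measured against the same baseline. The function names are illustrative only.

```python
def net_speedup(cycle_speedup: float, fmax_ratio: float) -> float:
    """Wall-clock speedup when the cycle count shrinks by `cycle_speedup`
    and the clock frequency is scaled by `fmax_ratio`.

    Assumes time = cycles / frequency (an assumption, not from the abstract).
    """
    return cycle_speedup * fmax_ratio


def implied_fmax_mhz(operating_mhz: float, fraction_of_max: float) -> float:
    """Maximum rating implied by an operating frequency quoted as a
    fraction of that rating."""
    return operating_mhz / fraction_of_max


FMAX_RATIO = 0.939  # clock frequency relative to the original design

for label, cycles in [("control-heavy", 1.07), ("looping", 1.92)]:
    print(f"{label}: {net_speedup(cycles, FMAX_RATIO):.3f}x")
# control-heavy: 1.005x
# looping: 1.803x

print(f"implied maximum rating: {implied_fmax_mhz(500, 0.909):.0f} MHz")
# implied maximum rating: 550 MHz
```

Under these assumptions, the control-heavy case roughly breaks even in wall-clock time, while the looping case keeps most of its cycle-count gain.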

Citations: 3

Scalable radio processor architecture for modern wireless communications

2014 International Conference on Field-Programmable Technology (FPT) Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082806

Young-Hwan Park, K. Prasad, Yeonbok Lee, Kitaek Bae, Ho Yang

Abstract: In this paper, we propose a scalable radio processor architecture targeting an OFDM-based wireless modem. The architecture is based on the coarse-grained reconfigurable array (CGRA), which provides programmable and flexible accelerators by reconfiguring hardware resources at run time. In addition, the architecture maximizes data parallelism by implementing 32-way SIMD operations. Other features considered in the current implementation include a mini-core structure, dedicated vector memory, and a simplified datapath. The proposed architecture is compared to the preceding 4×4 CGRA processor and evaluated with several communication kernels in terms of cycles, area, and power. The implementation results show that the proposed architecture achieves 3.6 times better cycle performance with 2 times better scheduling, but at the cost of double the area, resulting in 1495 cycles for a complex 2K-FFT; to the best of our knowledge, this is the best DSP cycle count reported to date. The synthesis results with a 32nm library also show that the proposed architecture is operational at 800MHz, which is capable of running a maximum of 128 GOPS of wireless applications.

Pages: 310-313
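To put the headline numbers of this abstract in proportion, the following back-of-the-envelope sketch derives two quantities the abstract does not state directly: the peak operations per clock cycle and the latency of the 2K-FFT. It assumes that the 1495-cycle figure and the 800 MHz synthesis result refer to the same design point; the function names are illustrative only.

```python
def ops_per_cycle(peak_ops_per_s: float, clock_hz: float) -> float:
    """Peak operations issued per clock cycle."""
    return peak_ops_per_s / clock_hz


def latency_us(cycles: int, clock_hz: float) -> float:
    """Time in microseconds to run `cycles` cycles at `clock_hz`."""
    return cycles / clock_hz * 1e6


CLOCK_HZ = 800e6        # 800 MHz, from the synthesis result
PEAK_OPS_PER_S = 128e9  # 128 GOPS, quoted maximum
FFT_CYCLES = 1495       # complex 2K-FFT cycle count

print(ops_per_cycle(PEAK_OPS_PER_S, CLOCK_HZ))        # 160.0
print(round(latency_us(FFT_CYCLES, CLOCK_HZ), 3))     # 1.869
```

Under that assumption, the quoted peak corresponds to 160 operations per cycle, and one 2K-FFT takes a little under 2 microseconds.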

Citations: 1

Integrating FPGA-based processing elements into a runtime for parallel heterogeneous computing

2014 International Conference on Field-Programmable Technology (FPT) Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082807

David de la Chevallerie, Jens Korinth, A. Koch

Citations: 4

Online scheduling for FPGA computation in the Cloud

2014 International Conference on Field-Programmable Technology (FPT) Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082811

Guohao Dai, Yi Shan, Fei Chen, Yu Wang, Kun Wang, Huazhong Yang

Abstract: The popularization and application of cloud computing have provided a new approach for users to get computing resources in recent years. Meanwhile, owing to advantages including programmability and power efficiency, FPGAs have been applied to custom computing in many domains. Previous work has made FPGA resources available in the cloud environment. However, the effective usage of FPGAs in the cloud requires efficient online task scheduling: properly assigning as many tasks from different tenants as possible to the FPGAs. In this paper, we propose a benefit-based scheduling metric to evaluate the task assignment. Based on the metric, we accelerate task execution according to our benefit-based scheduling algorithms. By applying our benefit-based scheduling metric to a real OpenStack-based cloud environment, 60.32% of computing resources are saved compared with the conventional throughput-based metric. Furthermore, a Replacement-Considering algorithm, which considers task replacement, is proposed, taking the characteristics of the cloud into account. The results show that our FPGA-accelerated cloud system is 1.386 times faster than with the previous algorithm.

Pages: 330-333

Citations: 20