2013 23rd International Conference on Field programmable Logic and Applications: Latest Publications

A hardware complete detection mechanism for an energy efficient reconfigurable accelerator CMA
2013 23rd International Conference on Field programmable Logic and Applications. Pub Date: 2013-10-24. DOI: 10.1109/FPL.2013.6645594
Akihito Tsusaka, Mai Izawa, Rie Uno, Nobuyuki Ozaki, H. Amano
Abstract: Cool Mega Array (CMA) is an energy-efficient Coarse Grained Reconfigurable processor Array (CGRA) consisting of a large PE (Processing Element) array. To reduce the power spent on storing intermediate results and on the clock tree, the PE array consists of combinatorial circuits. A hardware completion detection mechanism for CMA is proposed, implemented and evaluated. Each PE uses serially connected buffers with selectable taps, and the delay is decided according to the operation executed in the PE. Since the completion signal is transferred on exactly the same paths as those used for computation, the delays in the switches and wires are accounted for. Post-layout simulation revealed that the same performance as without the mechanism can be obtained with only 5.1% area overhead and less than 6% extra power consumption. With the mechanism, a single micro-code can be used for various supply voltages to the PE array.
Citations: 2
Degradation in FPGAs: Monitoring, modeling and mitigation (PHD forum paper: Thesis broad overview)
2013 23rd International Conference on Field programmable Logic and Applications. Pub Date: 2013-10-24. DOI: 10.1109/FPL.2013.6645614
A. Amouri, M. Tahoori
Abstract: The continuous shrinking of CMOS transistors in the nano-scale era poses many manufacturing and reliability challenges such as process variation, sub-threshold leakage, power dissipation, increased circuit noise sensitivity, and reliability concerns due to transient (e.g. radiation-induced soft errors) and permanent (e.g. transistor aging) failures [1, 2]. State-of-the-art FPGAs, pushed by the ever-increasing demands for higher performance and lower power, use the latest advancements in CMOS technology [3, 4], and thus they share most of these challenges. Therefore, to guarantee the required lifetime of FPGA-mapped systems in the field, proper techniques at various levels should be devised. Transistor aging, as an important factor, causes an increase in the magnitude of the threshold voltage, which in turn slows down the switching speed of the transistor and leads to timing failures and faster wear-out rates [5]. Properly dealing with this issue in FPGAs requires modeling, monitoring and mitigation at the device and architecture levels as well as in the tool-chain at the user level.
Citations: 1
Image recognition operation on a dynamically reconfigurable vision architecture
2013 23rd International Conference on Field programmable Logic and Applications. Pub Date: 2013-10-24. DOI: 10.1109/FPL.2013.6645603
Yuki Kamikubo, Minoru Watanabe, S. Kawahito
Abstract: Recently, for use in autonomous vehicles and robots, demand has been increasing for high-speed image recognition that is superior to that of the human eye. However, to recognize numerous images quickly, such systems require many template images to be read out dynamically from memory. The templates must then be sent to a processor quickly. Achieving such high-speed real-time image recognition operation is difficult because of the bottleneck of the transfer between the memory and the processor. To alleviate that bottleneck, a dynamically reconfigurable vision architecture was proposed. This paper presents a 16-gray-scale image recognition operation of the proposed architecture.
Citations: 0
An efficient FPGA overlay for portable custom instruction set extensions
2013 23rd International Conference on Field programmable Logic and Applications. Pub Date: 2013-10-24. DOI: 10.1109/FPL.2013.6645517
Dirk Koch, Christian Beckhoff, G. Lemieux
Abstract: Custom instruction set extensions can substantially boost the performance of reconfigurable softcore CPUs. While this approach is commonly tailored to one specific FPGA system, we present a fine-grained FPGA-like overlay architecture which can be implemented in the user logic of various FPGA families from different vendors. This allows the execution of a portable application consisting of a program binary and an overlay configuration in a completely heterogeneous environment. Furthermore, we present different optimizations for dramatically reducing the implementation cost of the proposed overlay architecture. In particular, this includes the mapping of the overlay interconnection network directly into the switch fabric of the hosting FPGA. Our case study demonstrates an overhead reduction of an order of magnitude as compared to related approaches.
Citations: 54
FPGA-accelerated sliding window classifier with structured features
2013 23rd International Conference on Field programmable Logic and Applications. Pub Date: 2013-10-24. DOI: 10.1109/FPL.2013.6645560
Ondrej Sychrovsky, Martin Matousek, R. Sára
Abstract: Certain classification tasks in computer vision require the classifier response to be computed in every pixel of an image. When combined with large, complex features, it becomes challenging to build such a classifier on a standard PC architecture and achieve real-time performance. We present an FPGA implementation of a car wheel classifier response computation, built as an instantiation of a generic classification system. An interesting optimization problem concerning performance and speed is addressed. Our implementation runs in real time as part of a more complex collision mitigation system based on car detection in video data.
Citations: 0
In pursuit of instant gratification for FPGA design
2013 23rd International Conference on Field programmable Logic and Applications. Pub Date: 2013-10-24. DOI: 10.1109/FPL.2013.6645505
A. Love, Wenwei Zha, P. Athanas
Abstract: This paper describes an alternative FPGA design compilation flow to reduce the back-end time required to implement a Xilinx FPGA design. Using a library of precompiled modules and associated meta-data, bitstream-level assembly of desired designs can occur in a fraction of the time of traditional back-end tools. Modules are bound, placed, and routed using custom bitstream assembly with the primary objective of rapid compilation while preserving performance. Since vendor tools are not needed for assembly, compilation can be performed in embedded and/or untethered environments. As a result, large device compilations can be assembled in seconds. This turbo flow (TFlow) enables software-like turn-around time for faster prototyping and increased productivity.
Citations: 21
Runtime assertions and exceptions for streaming systems
2013 23rd International Conference on Field programmable Logic and Applications. Pub Date: 2013-10-24. DOI: 10.1109/FPL.2013.6645597
T. Todman, W. Luk
Abstract: We present an approach to enable run-time, in-circuit assertions and exceptions in reconfigurable hardware designs. Static, compile-time checking, including formal verification, can catch many errors before a reconfigurable design is implemented. However, many other errors cannot be caught by static approaches, including those due to run-time data. Our approach allows users to add run-time assertions and exceptions to a design, giving multiple ways to handle run-time errors. Our work includes an abstract approach to adding assertions and exceptions to a design, a concrete implementation for Maxeler streaming designs, and an evaluation. Results show low overhead for adding exceptions to a design.
Citations: 4
TputCache: High-frequency, multi-way cache for high-throughput FPGA applications
2013 23rd International Conference on Field programmable Logic and Applications. Pub Date: 2013-10-24. DOI: 10.1109/FPL.2013.6645537
Aaron Severance, G. Lemieux
Abstract: Throughput processing involves using many different contexts or threads to solve multiple problems or subproblems in parallel, where the size of the problem is large enough that latency can be tolerated. Bandwidth is required to support multiple concurrent executions, however, and utilizing multiple external memory channels is costly. For small working sets, FPGA designers can use on-chip BRAMs to achieve the necessary bandwidth without increasing the system cost. Designing algorithms around fixed-size local memories is difficult, however, as there is no graceful fallback if the problem size exceeds the amount of local memory. This paper introduces TputCache, a cache designed to meet the needs of throughput processing on FPGAs, giving the throughput performance of on-chip BRAMs when the problem size fits in local memory. The design utilizes a replay-based architecture to achieve high frequency with very low resource overheads.
Citations: 7
Comparing and combining GPU and FPGA accelerators in an image processing context
2013 23rd International Conference on Field programmable Logic and Applications. Pub Date: 2013-10-24. DOI: 10.1109/FPL.2013.6645552
B. Silva, An Braeken, E. D'Hollander, A. Touhafi, Jan G. Cornelis, J. Lemeire
Abstract: Nowadays, processors alone cannot deliver what computation-hungry image processing applications demand. An alternative is to use hardware accelerators such as Graphics Processing Units (GPUs) or Field Programmable Gate Arrays (FPGAs). Applications, however, exhibit different performance characteristics depending on the accelerator. This paper describes the hybrid platform and the programming environment that allow programs to be created efficiently on a combined GPU/FPGA desktop. We use the roofline model to identify the most appropriate accelerator for each application and High-Level Synthesis (HLS) tools to reduce the FPGA development time. To introduce our platform and tool chain, both accelerators are compared by implementing a basic image operation. Next, a promising algorithm is explored and implemented, splitting and distributing the work between GPU, FPGA and CPU in order to validate the hybrid concept. Our results show that their combination exhibits higher performance for computationally intensive image processing applications than a GPU alone.
Citations: 22
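Note on the roofline model named in the abstract above: it bounds attainable performance by the lesser of peak compute throughput and memory bandwidth multiplied by operational intensity. A minimal Python sketch of that bound follows. The function names and the device figures are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, intensity):
    """Roofline bound in GFLOP/s.

    intensity is the operational intensity in FLOP per byte of memory traffic.
    """
    return min(peak_gflops, bandwidth_gbs * intensity)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Operational intensity above which a kernel becomes compute-bound."""
    return peak_gflops / bandwidth_gbs


# Hypothetical accelerator figures (illustrative only, not from the paper).
PEAK_GFLOPS = 1000.0
BANDWIDTH_GBS = 150.0

if __name__ == "__main__":
    for intensity in (0.5, 2.0, 10.0):
        bound = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GBS, intensity)
        print(f"{intensity} FLOP/byte -> at most {bound} GFLOP/s")
```

Evaluating the same kernel intensity against the figures of two different devices gives a first-order indication of which accelerator is the better fit, which is the kind of comparison the abstract describes.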
Parallel and scalable custom computing for real-time fluid simulation on a cluster node with four tightly-coupled FPGAs
2013 23rd International Conference on Field programmable Logic and Applications. Pub Date: 2013-10-24. DOI: 10.1109/FPL.2013.6645625
K. Sano, R. Ito, Hayato Suzuki, Yoshiaki Kono
Abstract: Summary form only given. Numerical simulation based on computational fluid dynamics (CFD) is now an indispensable technique, especially in industry, because it can acquire various data at a lower cost than experiments using a wind tunnel. The lattice Boltzmann method (LBM) is one of the CFD schemes and is used to compute various problems including multiphase flow. LBM has good parallelism, but it requires many data to compute each lattice point, resulting in a low operational intensity. Consequently, the sustained performance of LBM is limited by memory bandwidth rather than arithmetic performance when computed on general-purpose processors and GPUs. To make matters worse, the insufficient bandwidth and high latency of an interconnection network cause a relatively large overhead in parallel computing, especially in the case of strong scaling.
Citations: 0