2013 International Conference on Field-Programmable Technology (FPT): Latest Papers

Implementation of high performance hardware architecture of OpenSURF algorithm on FPGA
2013 International Conference on Field-Programmable Technology (FPT) Pub Date : 2013-12-01 DOI: 10.1109/FPT.2013.6718346
Xitian Fan, Chen-Mie Wu, Wei Cao, Xuegong Zhou, Shengye Wang, Lingli Wang
Abstract: This paper proposes a high-performance hardware architecture for the Speeded Up Robust Features (SURF) algorithm, based on OpenSURF. To achieve a high frame rate, the architecture has three main characteristics. First, a sliding-window method extracts feature points in parallel at selected scale levels, which greatly reduces the time spent on feature extraction. Second, a data-reuse strategy in orientation generation and descriptor generation reduces the number of memory accesses, yielding speedups of 3.87x and 2.25x respectively. Third, the integral image is segmented and buffered in different memory blocks so that multiple data items can be accessed in one clock cycle, which further reduces the overall computation time. The architecture is implemented on an XC6VSX475T FPGA running at 156 MHz, and its maximum frame rate for VGA images reaches 356 frames per second (fps), which is 6.25 times the frame rate of OpenSURF running on a server with a Xeon 5650 processor and 6 times the reported frame rate of a recent implementation on three Virtex-4 FPGAs [8].
Citations: 16
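SURF evaluates its box filters on an integral image, which is why the abstract above singles out integral-image memory access. As a minimal software sketch (the function names `integral_image` and `box_sum` are illustrative and not taken from the paper), any rectangular sum costs four table lookups, and these are the kind of reads that a segmented, multi-bank buffer could serve in a single clock cycle:

```python
def integral_image(img):
    """Return the summed-area table, padded with one leading row and column of zeros."""
    height, width = len(img), len(img[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def box_sum(table, top, left, height, width):
    """Sum of the pixels in a rectangle, using exactly four lookups."""
    bottom, right = top + height, left + width
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
table = integral_image(image)
print(box_sum(table, 1, 1, 2, 2))  # 5 + 6 + 8 + 9 = 28
```

The four corner reads are independent of each other, so placing them in different memory blocks is one plausible way to fetch all of them at once.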
COFFE: Fully-automated transistor sizing for FPGAs
2013 International Conference on Field-Programmable Technology (FPT) Pub Date : 2013-12-01 DOI: 10.1109/FPT.2013.6718327
Charles Chiasson, Vaughn Betz
Abstract: In this paper, we present COFFE (Circuit Optimization For FPGA Exploration), a new fully-automated transistor sizing tool for FPGAs. Automated transistor-level CAD tools are an important part of the architecture exploration flow because they provide accurate area and delay estimates of low-level FPGA circuitry, which must be obtained for each architecture. We show that modeling transistors as linear resistances and capacitances, as has been done in previous FPGA transistor sizing tools, is highly inaccurate for fine-grained transistor-level design in advanced process nodes. Therefore, COFFE's transistor sizing algorithm maintains circuit non-linearities by relying exclusively on HSPICE simulations to measure delay. Area is estimated with a transistor size-based model that incorporates a number of improvements to enhance its accuracy in advanced process technologies versus prior methods. In addition to more accurate area and delay estimation, COFFE considers more layout effects than prior published work by automatically accounting for transistor and wire loads, which are computed based on architectural parameters and layout area. This new FPGA transistor sizing tool requires only several hours to produce high-quality transistor sizing results for an entire FPGA tile, a task that would normally take months of manual effort. We demonstrate COFFE's utility in FPGA architecture studies by investigating an important new architectural question at the logic-to-routing interface.
Citations: 73
High-level synthesis of dynamic data structures: A case study using Vivado HLS
2013 International Conference on Field-Programmable Technology (FPT) Pub Date : 2013-12-01 DOI: 10.1109/FPT.2013.6718388
F. Winterstein, Samuel Bayliss, G. Constantinides
Abstract: High-level synthesis promises a significant shortening of the FPGA design cycle when compared with design entry using register transfer level (RTL) languages. Recent evaluations report that C-to-RTL flows can produce results with a quality close to hand-crafted designs [1]. Algorithms which use dynamic, pointer-based data structures, which are common in software, remain difficult to implement well. In this paper, we describe a comparative case study using Xilinx Vivado HLS as an exemplary state-of-the-art high-level synthesis tool. Our test cases are two alternative algorithms for the same compute-intensive machine learning technique (clustering) with significantly different computational properties. We compare a data-flow centric implementation to a recursive tree traversal implementation which incorporates complex data-dependent control flow and makes use of pointer-linked data structures and dynamic memory allocation. The outcome of this case study is twofold. We confirm similar performance between the hand-written and automatically generated RTL designs for the first test case. The second case reveals a degradation in latency by a factor greater than 30× if the source code is not altered prior to high-level synthesis. We identify the reasons for this shortcoming and present code transformations that narrow the performance gap to a factor of four. We generalise our source-to-source transformations, whose automation motivates future research directions for improving high-level synthesis of dynamic data structures.
Citations: 113
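The abstract above does not list its code transformations, so the following is only a generic illustration of the kind of rewrite that makes pointer-linked, recursive code more amenable to hardware synthesis: nodes move into fixed parallel arrays addressed by index, and recursion becomes an explicit stack. Python stands in for C here, and every name (`Node`, `flatten`, `sum_iterative`, `NIL`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def sum_recursive(node):
    """Software-style traversal: heap pointers and recursion."""
    if node is None:
        return 0
    return node.value + sum_recursive(node.left) + sum_recursive(node.right)


NIL = -1  # index that plays the role of a null pointer


def flatten(root):
    """Copy the tree into parallel arrays: value, left index, right index."""
    values, lefts, rights = [], [], []

    def visit(node):
        if node is None:
            return NIL
        index = len(values)
        values.append(node.value)
        lefts.append(NIL)
        rights.append(NIL)
        lefts[index] = visit(node.left)
        rights[index] = visit(node.right)
        return index

    return visit(root), values, lefts, rights


def sum_iterative(root_index, values, lefts, rights):
    """Hardware-friendlier traversal: array indices and an explicit stack."""
    total = 0
    stack = [root_index]
    while stack:
        index = stack.pop()
        if index == NIL:
            continue
        total += values[index]
        stack.append(lefts[index])
        stack.append(rights[index])
    return total


tree = Node(1, Node(2, Node(4)), Node(3))
print(sum_recursive(tree), sum_iterative(*flatten(tree)))
```

Both traversals visit the same nodes; the second form maps more directly onto on-chip memories and a bounded stack, which is an assumption about what a synthesis tool handles well rather than a result reported by the paper.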
A non-intrusive portable fault injection framework to assess reliability of FPGA-based designs
2013 International Conference on Field-Programmable Technology (FPT) Pub Date : 2013-12-01 DOI: 10.1109/FPT.2013.6718397
Elyas Abolhassani Ghazaani, Zana Ghaderi, S. Miremadi
Abstract: This paper proposes a full-featured fault injection framework to assess the reliability of FPGA-based designs. The framework provides non-intrusiveness, portability, flexibility and performance in evaluating the reliability of FPGA-based designs against the adverse effects of single-event upsets (SEUs). It works in a non-intrusive manner, allowing the reliability of ready-to-be-released designs to be assessed independently, without any intrusion into their place-and-route characteristics. We have studied the implications of a framework's intrusiveness into the design under test by comparing the proposed non-intrusive framework with previous intrusive methods; up to 5% deviation in the number of effective faults is observed with intrusive methods. Because it is portable, the framework can be applied to a wide variety of FPGAs. It is also flexible, allowing the user to define the desired parameters for different fault injection strategies. Finally, the framework performs the process of injecting faults, evaluating the design and removing faults in about 17 ms, on average.
Citations: 5
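The inject, evaluate, remove cycle described in the abstract above can be sketched in software as a loop over bit flips. This is a toy model under stated assumptions, not the paper's framework: the configuration memory is a plain `bytearray`, and `design_output` is a hypothetical stand-in in which only the first two bytes influence the design's behaviour.

```python
import random


def flip_bit(bits: bytearray, index: int) -> None:
    """Model an SEU by inverting one bit of the configuration memory."""
    bits[index // 8] ^= 1 << (index % 8)


def design_output(config: bytearray) -> int:
    """Hypothetical design under test: only bytes 0 and 1 matter."""
    return config[0] ^ config[1]


def run_campaign(config: bytearray, bit_indices) -> int:
    """Count effective faults, i.e. flips that change the observed output."""
    golden = design_output(config)
    effective = 0
    for bit in bit_indices:
        flip_bit(config, bit)                # inject the fault
        if design_output(config) != golden:  # evaluate the design
            effective += 1
        flip_bit(config, bit)                # remove the fault
    return effective


config = bytearray(b"\x5a\x3c\x00\xff")
exhaustive = run_campaign(config, range(len(config) * 8))
sampled = run_campaign(config, random.Random(0).sample(range(32), 10))
print(exhaustive, sampled)
```

Passing the bit indices as an argument keeps the injection strategy (exhaustive, random sample, targeted region) outside the loop, which mirrors the user-defined strategies mentioned in the abstract only in spirit.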
Color configuration method for an optically reconfigurable gate array
2013 International Conference on Field-Programmable Technology (FPT) Pub Date : 2013-12-01 DOI: 10.1109/FPT.2013.6718400
Takumi Fujimori, Minoru Watanabe
Abstract: This paper proposes a color configuration method for an optically reconfigurable gate array (ORGA). A conventional ORGA uses a single-wavelength laser array to address configuration contexts. The new ORGA, however, also includes lasers of other wavelengths in its laser array. Consequently, the addressable number of configuration contexts can be increased.
Citations: 0
Adaptive compression for instruction code of Coarse Grained Reconfigurable Architectures
2013 International Conference on Field-Programmable Technology (FPT) Pub Date : 2013-12-01 DOI: 10.1109/FPT.2013.6718396
Moo-Kyoung Chung, Jun-Kyoung Kim, Yeon-Gon Cho, Soojung Ryu
Abstract: Coarse Grained Reconfigurable Architecture (CGRA) achieves high performance by exploiting instruction-level parallelism with software pipelining. However, the large instruction memory is a critical problem for CGRAs, as it requires large silicon area and consumes considerable power. Code compression is a promising technique to reduce the memory area, bandwidth requirements, and power consumption. We present an adaptive code compression scheme for CGRA instructions based on dictionary-based compression, where the compression mode and dictionary contents are adaptively selected for each execution kernel and compression group. In addition, the hardware decompressor can be designed efficiently, with two-cycle latency and negligible silicon overhead. The proposed method achieves an average compression ratio of 0.52 on a CGRA with a 16-functional-unit array in experiments with well-optimized applications.
Citations: 5
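To make the idea of adaptively chosen dictionary compression concrete, here is a minimal sketch. The encoding is an assumption for illustration only (one flag bit per word, then either a dictionary index or the raw word, with the dictionary itself counted as storage), and the compression ratio is taken to mean compressed size divided by original size; the paper's actual format may differ. `compressed_bits` and `best_mode` are hypothetical names.

```python
from collections import Counter


def compressed_bits(words, word_bits, dict_size):
    """Size in bits of one kernel's code under a given dictionary size."""
    if dict_size == 0:
        return len(words) * word_bits  # uncompressed mode, no flag bits
    dictionary = {w for w, _ in Counter(words).most_common(dict_size)}
    index_bits = max(1, (dict_size - 1).bit_length())
    payload = sum(1 + (index_bits if w in dictionary else word_bits)
                  for w in words)
    return payload + len(dictionary) * word_bits  # include dictionary storage


def best_mode(words, word_bits, candidate_sizes=(0, 2, 4, 8, 16)):
    """Pick the dictionary size that minimises the size of this kernel."""
    original = len(words) * word_bits
    size = min(candidate_sizes,
               key=lambda s: compressed_bits(words, word_bits, s))
    return size, compressed_bits(words, word_bits, size) / original


# A toy kernel: one instruction word dominates, as in a tight software pipeline.
kernel = [0xA] * 12 + [0xB] * 3 + [0xC]
size, ratio = best_mode(kernel, word_bits=32)
print(size, ratio)
```

Running `best_mode` once per kernel (or per compression group) is what makes the selection adaptive in this sketch: a kernel with few repeated words would fall back to size 0 and stay uncompressed.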
Making domain-specific hardware synthesis tools cost-efficient
2013 International Conference on Field-Programmable Technology (FPT) Pub Date : 2013-12-01 DOI: 10.1109/FPT.2013.6718341
N. George, D. Novo, Tiark Rompf, Martin Odersky, P. Ienne
Abstract: Tools to design hardware at a high level of abstraction promise software-like productivity for hardware designs. Among them, tools like Spiral, HDL Coder, Optimus and MMAlpha target specific application domains and produce highly efficient implementations from high-level input specifications in a Domain Specific Language (DSL). However, developing similar domain-specific High-Level Synthesis (HLS) tools requires enormous effort, which might offset their many advantages. In this paper, we propose a novel, cost-effective approach to develop domain-specific HLS tools. We develop the HLS tool by embedding its input DSL in Scala and using Lightweight Modular Staging (LMS), a compiler framework written in Scala, to perform optimizations at different abstraction levels. For example, to optimize computation on matrices, some optimizations are more effective when the program is represented at the level of matrices while others are better applied at the level of individual matrix elements. To illustrate the proposed approach, we create an HLS flow to automatically generate efficient hardware implementations of matrix expressions described in our own high-level specification language. Although a simple example, it shows how easy it is to reuse modules across different HLS flows and to integrate our flow with existing tools like LegUp, a C-to-RTL compiler, and FloPoCo, an arithmetic core generator. The results reveal that our approach can simultaneously achieve high productivity and design quality with a very reasonable tool development effort.
Citations: 23
A hardware acceleration of a phylogenetic tree reconstruction with maximum parsimony algorithm using FPGA
2013 International Conference on Field-Programmable Technology (FPT) Pub Date : 2013-12-01 DOI: 10.1109/FPT.2013.6718376
Henry Block, T. Maruyama
Abstract: In this paper, we present a hardware acceleration approach for phylogenetic tree reconstruction with the maximum parsimony algorithm using an FPGA. The algorithm is based on a stochastic local search with the progressive tree neighborhood. The hardware architecture is divided into different units, each of which performs a specific task of the algorithm, to take advantage of the parallel processing capabilities of the FPGA. We show results for four real-world biological datasets, and compare them against results from two programs: our C++ implementation and TNT (a program for phylogenetic analysis). High acceleration rates are obtained against our C++ implementation, but not against TNT, which even proves faster in some cases. We conclude our work with a discussion of this issue.
Citations: 5
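A local search for the most parsimonious tree needs a way to score each candidate tree. The abstract above does not say how the hardware computes that score; the sketch below uses Fitch's algorithm, the standard method for counting the minimum number of character changes on a fixed binary tree, purely as an assumed illustration. The tree encoding (nested 2-tuples of leaf names) and the function name `fitch_score` are hypothetical.

```python
def fitch_score(tree, leaf_states):
    """Parsimony score of a rooted binary tree.

    tree: a leaf name (str) or a 2-tuple of subtrees.
    leaf_states: maps each leaf name to its aligned character sequence.
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):
            return [{char} for char in leaf_states[node]]
        left_sets, right_sets = (visit(child) for child in node)
        merged = []
        for left, right in zip(left_sets, right_sets):
            common = left & right
            if common:
                merged.append(common)        # children agree: no change needed
            else:
                merged.append(left | right)  # children disagree: one change
                changes += 1
        return merged

    visit(tree)
    return changes


sequences = {"A": "AC", "B": "AG", "C": "CG", "D": "CG"}
print(fitch_score((("A", "B"), ("C", "D")), sequences))  # 2
print(fitch_score((("A", "C"), ("B", "D")), sequences))  # 3
```

Each site is scored independently of the others, which is the kind of regular, data-parallel work that suits dedicated hardware units; the first topology wins here because it needs fewer changes.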
A speculative gather system for Cool Mega-Array
2013 International Conference on Field-Programmable Technology (FPT) Pub Date : 2013-12-01 DOI: 10.1109/FPT.2013.6718383
Rie Uno, N. Ozaki, Mai Izawa, Akihito Tsusaka, Takaaki Miyajima, H. Amano
Abstract: Cool Mega Array (CMA) is a low-power reconfigurable processor array for battery-driven mobile devices. A prototype chip, CMA-1, consists of an 8 × 8 PE (Processing Element) array and a micro-controller for controlling data alignment. Because the PE array of CMA is built as a combinatorial circuit, it has no signal that indicates when an operation in the PE array has completed. Instead, the propagation delay of the whole PE array, which corresponds to the operation time, was estimated from the data path and mapping information in the design stage of the application, and the timing for gathering the data was specified in the microcode of the controller. However, since this timing is fixed, it cannot accommodate variations in environment temperature or voltage scaling of the PE array. Here, a speculative gather system is proposed which dynamically sets the timing for collecting operation results from the PE array. By collecting results twice and comparing them, it guarantees the correctness of the operation results and adjusts the gather timing automatically. The speculative gather system is implemented in the CMA, and evaluation results show that performance is improved by 25.3% on average, with an overhead of 0.5% in area and 3.1% in power consumption.
Citations: 2
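The collect-twice-and-compare idea in the abstract above can be mimicked with a small simulation. Everything specific here is an assumption made for illustration: `sample_pe_array` is a hypothetical model in which outputs are wrong until a settle time has passed, the second sample is taken one cycle after the first, and the controller tries a one-cycle-shorter wait after every first-try match.

```python
def sample_pe_array(cycles_waited, settle_cycles, final_value):
    """Hypothetical PE array: the output is only valid after settle_cycles."""
    if cycles_waited >= settle_cycles:
        return final_value
    return final_value ^ (cycles_waited + 1)  # still-changing, wrong value


def speculative_gather(wait, settle_cycles, final_value):
    """Gather twice and compare; return (result, wait to use next time)."""
    first = sample_pe_array(wait, settle_cycles, final_value)
    second = sample_pe_array(wait + 1, settle_cycles, final_value)
    retried = False
    while first != second:  # outputs were still changing: wait longer
        retried = True
        wait += 1
        first = second
        second = sample_pe_array(wait + 1, settle_cycles, final_value)
    # Speculate on a shorter wait next time only if this one matched at once.
    next_wait = wait if retried else max(0, wait - 1)
    return second, next_wait


wait, history = 7, []
for _ in range(4):
    result, wait = speculative_gather(wait, settle_cycles=5, final_value=0x1234)
    history.append(wait)
print(hex(result), history)
```

In this model the wait time drifts down from a conservative start toward the true settle time and backs off after a mismatch. Real hardware would also have to rule out two unsettled samples that happen to match, which this toy model avoids by construction.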
FPGA-accelerated key search for cold-boot attacks against AES
2013 International Conference on Field-Programmable Technology (FPT) Pub Date : 2013-12-01 DOI: 10.1109/FPT.2013.6718394
Heinrich Riebler, Tobias Kenter, Christoph Sorge, Christian Plessl
Abstract: Cold-boot attacks exploit the fact that DRAM contents are not immediately lost when a PC is powered off. Instead, the contents decay rather slowly, in particular if the DRAM chips are cooled to low temperatures. This effect opens an attack vector on cryptographic applications that keep decrypted keys in DRAM. An attacker with access to the target computer can reboot it or remove the RAM modules and quickly copy the RAM contents to non-volatile memory. By exploiting the known cryptographic structure of the cipher and the layout of the key data in memory (in our application, an AES key schedule with redundancy), the resulting memory image can be searched for sections that could correspond to decayed cryptographic keys; the attacker can then attempt to reconstruct the original key. However, the runtime of these algorithms grows rapidly with increasing memory image size, error rate and complexity of the bit error model, which limits the practicability of the approach. In this work, we study how the algorithm for key search can be accelerated with custom computing machines. We present an FPGA-based architecture on a Maxeler dataflow computing system that outperforms a software implementation by up to 205x, which significantly improves the practicability of cold-boot attacks against AES.
Citations: 7
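The redundancy mentioned in the abstract above comes from the AES key schedule: each 32-bit word is a fixed function of two earlier words, so a 176-byte AES-128 schedule can be recognised in memory even with a few flipped bits. The sketch below is a simplified software illustration, not the paper's algorithm or its bit error model: it slides a window over the image and counts how many bits violate the schedule relations. The names `schedule_mismatch` and `find_key_schedules` and the error threshold are assumptions; the key is the AES-128 example key from FIPS 197.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = (a << 1) ^ (0x11B if a & 0x80 else 0)
        b >>= 1
    return result


def build_sbox() -> list:
    """Derive the AES S-box: field inverse followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gf_mul(x, y) == 1), 0)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()
RCON = [0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1B, 0x36]


def next_word(back: bytes, prev: bytes, i: int) -> bytes:
    """AES-128 schedule relation: w[i] = w[i-4] XOR f(w[i-1])."""
    t = prev
    if i % 4 == 0:  # RotWord, SubWord, then the round constant
        t = bytes(SBOX[b] for b in prev[1:] + prev[:1])
        t = bytes([t[0] ^ RCON[i // 4 - 1]]) + t[1:]
    return bytes(a ^ b for a, b in zip(back, t))


def expand_key(key: bytes) -> bytes:
    """Expand a 16-byte key into the 176-byte AES-128 key schedule."""
    words = [key[i:i + 4] for i in range(0, 16, 4)]
    for i in range(4, 44):
        words.append(next_word(words[i - 4], words[i - 1], i))
    return b"".join(words)


def schedule_mismatch(window: bytes) -> int:
    """Number of bits in a 176-byte window that violate the relations."""
    words = [window[i:i + 4] for i in range(0, 176, 4)]
    return sum(
        bin(predicted ^ observed).count("1")
        for i in range(4, 44)
        for predicted, observed in zip(
            next_word(words[i - 4], words[i - 1], i), words[i])
    )


def find_key_schedules(image: bytes, max_errors: int):
    """Return (offset, mismatch) for every window that looks like a schedule."""
    hits = []
    for offset in range(len(image) - 175):
        errors = schedule_mismatch(image[offset:offset + 176])
        if errors <= max_errors:
            hits.append((offset, errors))
    return hits


key = bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c")
decayed = bytearray(expand_key(key))
decayed[50] ^= 0x01   # simulate two decayed bits
decayed[100] ^= 0x10
image = bytes(range(40)) + bytes(decayed) + bytes(range(60))
hits = find_key_schedules(image, max_errors=16)
print(hits)
```

Every window is checked independently, and the cost grows with image size and with how much error the check must tolerate, which is consistent with the abstract's motivation for hardware acceleration. Recovering the original key from a decayed hit is a further step that this sketch does not attempt.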