2016 International Conference on ReConFigurable Computing and FPGAs (ReConFig): Latest Literature

Area-driven partial reconfiguration for SEU mitigation on SRAM-based FPGAs
2016 International Conference on ReConFigurable Computing and FPGAs (ReConFig) Pub Date: 2016-11-01 DOI: 10.1109/ReConFig.2016.7857154
M. Vavouras, C. Bouganis
Abstract: This paper presents an area-driven Field-Programmable Gate Array (FPGA) scrubbing technique based on partial reconfiguration for Single Event Upset (SEU) mitigation. The proposed method is compared with existing techniques such as blind and on-demand scrubbing on a novel SEU mitigation framework implemented on the Zynq platform, supporting various SEU and scrubbing rates. A design space exploration of availability versus data transfers from a Double Data Rate Type 3 (DDR3) memory shows that our approach outperforms blind scrubbing for a range of availability values when a second-order polynomial IP is targeted. A comparison to an existing on-demand scrubbing technique based on Dual Modular Redundancy (DMR) shows that our approach saves up to 46% area for the same case study.
Citations: 2
Hybrid energy-aware reconfiguration management on Xilinx Zynq SoCs
2016 International Conference on ReConFigurable Computing and FPGAs (ReConFig) Pub Date: 2016-11-01 DOI: 10.1109/ReConFig.2016.7857177
Andreas Becher, Jutta Pirkl, Achim Herrmann, J. Teich, S. Wildermann
Abstract: Partial Reconfiguration is a common technique on FPGA platforms to load hardware accelerators at runtime without interrupting the remaining system. One crucial element is the time needed for reconfiguration, as it affects usability, performance and energy consumption. Furthermore, many systems have to share partial areas between multiple applications and users. In this paper, we introduce a novel open-source reconfiguration manager for Xilinx Zynq SoCs which a) allows partial area sharing and b) includes a hybrid reconfiguration approach utilizing both the Processor Configuration Access Port (PCAP) and the Internal Configuration Access Port (ICAP) in order to minimize reconfiguration time and system energy consumption. We evaluate our design and identify the sweet spots between energy consumption and latency of accelerator availability with an example use case. By means of the hybrid approach, a speedup of up to 64% can be achieved for the full configuration after powering on the FPGA, in comparison to solely using the PCAP interface.
Citations: 7
Reconfigurable computing for network function virtualization: A protocol independent switch
2016 International Conference on ReConFigurable Computing and FPGAs (ReConFig) Pub Date: 2016-11-01 DOI: 10.1109/ReConFig.2016.7857183
Qianqiao Chen, Vaibhawa Mishra, G. Zervas
Abstract: Network function virtualization (NFV) aims to decouple software network applications from their hardware in order to reduce development and deployment costs for new services. To enable the deployment of diverse network services, a reconfigurable and high-performance hardware platform can bring considerable benefits to NFV. In this paper, an FPGA-based platform is proposed to act as a protocol-reconfigurable NFV switch. The logic circuits of virtual network functions can be reconfigured at run time on the proposed platform. A reconfiguration process is also proposed to enable packet-loss-free switch-over between virtual network functions, delivering undisrupted service. The platform can be reconfigured between a Layer 1 circuit switch and a Layer 2 Ethernet packet switch. Once running as a packet switch, the platform can switch over from a Layer 2 Ethernet switch to a Layer 3 IP parser and even a Layer 4 UDP parser. The implemented 2×2 switch, running at 10 Gbps per port, delivers a minimum latency of 300 nanoseconds (circuit switch) and a maximum latency of 1 microsecond. Reconfiguration between the IP and UDP parsers without loss of data is also demonstrated.
Citations: 10
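As a rough illustration of what the latency figures quoted in the abstract above mean at a 10 Gbps line rate, the sketch below computes how many bits one port sends during the switch latency. The function name `bits_in_flight` and the integer Gbps/nanosecond units are assumptions made for this example. They do not come from the paper, and the result is plain arithmetic, not a measured buffer size.

```python
def bits_in_flight(line_rate_gbps, latency_ns):
    """Bits sent on one port during the given latency.

    1 Gbps * 1 ns = 1 bit, so integer Gbps and nanoseconds multiply directly.
    """
    return line_rate_gbps * latency_ns

# Figures taken from the abstract: 10 Gbps per port, 300 ns minimum, 1 us maximum.
min_bits = bits_in_flight(10, 300)    # circuit-switch case
max_bits = bits_in_flight(10, 1000)   # worst case quoted

print(min_bits, min_bits // 8)  # 3000 bits, 375 bytes
print(max_bits, max_bits // 8)  # 10000 bits, 1250 bytes
```

So at the quoted maximum latency, one port sends about 1250 bytes while a packet traverses the switch, which is close to one full-size Ethernet frame.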
Dataflow optimization for programmable embedded image preprocessing accelerators
2016 International Conference on ReConFigurable Computing and FPGAs (ReConFig) Pub Date: 2016-11-01 DOI: 10.1109/ReConFig.2016.7857161
T. Lieske, M. Reichenbach, Burkhard Ringlein, D. Fey
Abstract: Image processing is an omnipresent topic in current embedded industrial and consumer applications. Therefore, it is important to investigate processing architectures to extract design guidelines for developing efficient image processors. While SIMD (single instruction, multiple data) processor arrays were often proposed to accelerate image processing tasks, the internal architecture of processor elements (PEs) has not been optimized. Nevertheless, it is necessary to evaluate the optimal complexity of PEs to trade off performance against the architectural overhead caused by complex processor architectures. Hence, the goal of this paper is to present an in-depth evaluation of the right architectural complexity of PEs in a processor field to meet given performance and logic area constraints. In order to determine the optimal complexity, the ADL (architecture description language) based FAUPU framework for image preprocessing architectures is utilized and, after evaluation, extended with pipelining support. The newly introduced pipelining features enable resource-efficient performance optimizations and are a significant improvement to the FAUPU ADL. Due to the fine-grained configurability of the FAUPU architecture, several design variants can be easily generated, and it is possible to evaluate the effects of instruction set architecture (ISA) complexity and pipelining on design properties and how these features are best combined. Consequently, the FAUPU framework can be used to address the question of whether it is better to use many lightweight cores or whether fewer but more complex cores yield a greater performance-to-area ratio. The results show that lightweight cores are best suited to achieve a targeted frame rate with the least resources. More complex cores, on the other hand, yield better performance-to-area ratios.
Citations: 2
RePaBit: Automated generation of relocatable partial bitstreams for Xilinx Zynq FPGAs
2016 International Conference on ReConFigurable Computing and FPGAs (ReConFig) Pub Date: 2016-11-01 DOI: 10.1109/ReConFig.2016.7857186
J. Rettkowski, Konstantin Friesen, D. Göhringer
Abstract: Partial reconfiguration in FPGAs increases the flexibility of a system due to dynamic replacement of hardware modules. However, more memory is needed to store all partial bitstreams, and the generation of all partial bitstreams for all possible regions on the FPGA is very time-consuming. In order to overcome these issues, bitstream relocation can be used. In this paper, a novel approach that facilitates bitstream relocation with the Xilinx Vivado tool flow is presented. In addition, the approach is automated by TCL scripts that extend Vivado to RePaBit. RePaBit is successfully evaluated on the Xilinx Zynq FPGA using 1D and 2D relocation of complex modules such as MicroBlaze processors. The results show a negligible overhead in terms of area and frequency while enabling more flexibility by partial bitstream relocation as well as a faster design time.
Citations: 17
Automated synthesis of FPGA-based packet filters for 100 Gbps network monitoring applications
2016 International Conference on ReConFigurable Computing and FPGAs (ReConFig) Pub Date: 2016-11-01 DOI: 10.1109/ReConFig.2016.7857156
J. F. Zazo, S. López-Buedo, G. Sutter, J. Aracil
Abstract: Monitoring 100 Gbps network links is a challenging task. Packet filtering allows monitoring applications to focus on the relevant data, discarding packets that do not provide any valuable information. However, such a large line rate calls for custom hardware solutions. This work presents a tool for automatically synthesizing packet filters from a custom grammar, which defines filters in a human-readable format. Thanks to parser generators (Bison) and lexical analyzers (Flex), Verilog code is automatically generated from the filter specification. Rules can be applied over a protocol, a protocol field, the packet payload, or a combination of them. The generated filters use standard AXI4-Stream interfaces, which seamlessly integrate into the packet filtering framework that we have developed for the integrated block for 100G Ethernet available in Xilinx UltraScale devices. We present the results for two proof-of-concept packet filtering designs. Furthermore, filters are fully pipelined, so the full 100 Gb/s rate is guaranteed. As the framework uses a cut-through approach, latency is kept to a minimum. Finally, the proposed framework allows for the integration of more complex payload-level filters, written in C with the Vivado-HLS tool.
Citations: 8
Packing a modern Xilinx FPGA using RapidSmith
2016 International Conference on ReConFigurable Computing and FPGAs (ReConFig) Pub Date: 2016-11-01 DOI: 10.1109/ReConFig.2016.7857180
Travis Haroldsen, B. Nelson, B. Hutchings
Abstract: Academic packing algorithms have typically been limited to theoretical architectures. In this paper, we describe RSVPack, a packing algorithm built on top of RapidSmith to target the Xilinx Virtex 6 architecture. We integrate our packer into the Xilinx ISE CAD flow and demonstrate our packer tool by packing a set of benchmark circuits and performing routing and timing analysis inside ISE.
Citations: 6
Thread shadowing: On the effectiveness of error detection at the hardware thread level
2016 International Conference on ReConFigurable Computing and FPGAs (ReConFig) Pub Date: 2016-11-01 DOI: 10.1109/ReConFig.2016.7857193
S. Meisner, M. Platzner
Abstract: Dynamic thread duplication is a known redundancy technique for multi-cores. Recent research applied this concept to hybrid multi-cores for error detection and introduced thread shadowing, which runs hardware threads in the reconfigurable cores and compares their outputs for deviation at configurable signature levels. Previously published work evaluated this concept in terms of performance, error detection latency and resource consumption. In this paper, we report on the error detection capabilities of thread shadowing by presenting an extensive fault injection campaign. We employ the Xilinx Soft Error Mitigation Controller for fault injection and the Xilinx Essential Bit facility to limit the fault injections to relevant bits in the configuration bitstream. Our findings from fault injection experiments with a sorting benchmark are threefold: First, up to 98% of all errors are detected by the operating system of the hybrid multi-core supported by thread shadowing. Second, thread shadowing's signature levels provide a useful trade-off between detected errors and effort needed, with around 5% of all errors detected in calls to operating system functions and around 52% of errors detected in memory accesses of the hardware thread. Third, essential bit testing is effective and cuts down the number of bits to be tested by a factor of 14.48 compared to the total number of bits available in the configuration address space.
Citations: 1
A scalable latency-insensitive architecture for FPGA-accelerated semi-global matching in stereo vision applications
2016 International Conference on ReConFigurable Computing and FPGAs (ReConFig) Pub Date: 2016-11-01 DOI: 10.1109/ReConFig.2016.7857147
Jaco A. Hofmann, Jens Korinth, A. Koch
Abstract: Semi-Global Matching (SGM) is a high-performance method for computing high-quality disparity maps from stereo camera images in machine vision applications. It is also suitable for direct hardware execution, e.g., in ASICs or reconfigurable logic devices. We present a highly parametrized FPGA implementation, scalable from simple low-resolution low-power use-cases up to complex real-time full-HD multi-camera scenarios. By using a latency-insensitive design style, high-level synthesis from the Bluespec SystemVerilog next-generation hardware description language, and an automated design-space exploration flow, many implementation alternatives could be examined with high productivity. The use of the Threadpool Composer system-on-chip assembly tool allows the portable mapping of the SGM accelerator to different hardware platforms. The accelerator performance exceeds that of prior fixed-architecture approaches.
Citations: 4
Overloaded CDMA interconnect for Network-on-Chip (OCNoC)
2016 International Conference on ReConFigurable Computing and FPGAs (ReConFig) Pub Date: 2016-11-01 DOI: 10.1109/ReConFig.2016.7857179
K. E. Ahmed, M. Rizk, Mohammed M. Farag
Abstract: Networks on Chip (NoCs) have replaced on-chip buses as the paramount communication strategy in large-scale Systems-on-Chips (SoCs). Code Division Multiple Access (CDMA) has been proposed as an interconnect fabric that can achieve high throughput and fixed transfer latency due to the CDMA transmission concurrency. Overloaded CDMA Interconnect (OCI) is an architectural evolution of the conventional CDMA interconnects that can double their bandwidth at marginal cost. Employing OCI in CDMA-based NoCs has the potential of providing higher bandwidth at low power and area overheads compared to other NoC architectures. Furthermore, fixed latency and predictable performance achieved by the inherent CDMA concurrency can reduce the effort and overhead required to implement QoS. In this work, we advance the Overloaded CDMA interconnect for Network on Chip (OCNoC) dynamic central router. The OCNoC router leverages the overloaded CDMA concept to reduce the overall packet transfer latency and improve the network throughput at a negligible area overhead. Dynamic code assignment is adopted to reduce the decoding complexity and transfer latency and to maximize the crossbar utilization. Two OCNoC solutions are advanced: serial and parallel CDMA encoding schemes. The OCNoC central routers are implemented and validated on a Virtex-7 VC709 FPGA kit. Evaluation results show a throughput enhancement of up to 142% with a 1.7% variation in packet latencies. Synthesized using a 65 nm ASIC standard cell library, the presented ASIC OCNoC router requires 61% less area per processing element at an 81.5% saving in energy dissipation compared to conventional CDMA-based NoCs.
Citations: 1
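The "CDMA transmission concurrency" mentioned in the abstract above can be illustrated with a minimal sketch of conventional (non-overloaded) CDMA using orthogonal Walsh codes. This is a generic textbook scheme, not the OCI or OCNoC design from the paper. The function names `walsh_codes`, `cdma_encode` and `cdma_decode` are hypothetical and chosen only for this example.

```python
def walsh_codes(n):
    """Build Walsh-Hadamard codes of +1/-1 chips; n is assumed to be a power of two."""
    h = [[1]]
    while len(h) < n:
        # Sylvester construction: [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-c for c in row] for row in h]
    return h

def cdma_encode(bits, codes):
    """Map each sender's bit (0/1) to -1/+1, spread it with its code, and sum all chips."""
    symbols = [2 * b - 1 for b in bits]
    n_chips = len(codes[0])
    return [sum(s * code[j] for s, code in zip(symbols, codes)) for j in range(n_chips)]

def cdma_decode(channel, codes):
    """Correlate the summed channel with each code; the sign recovers that sender's bit."""
    return [1 if sum(c * x for c, x in zip(code, channel)) > 0 else 0 for code in codes]

codes = walsh_codes(4)
sent = [1, 0, 0, 1]                    # one bit from each of four senders
channel = cdma_encode(sent, codes)     # all four transmit at the same time
print(cdma_decode(channel, codes))     # [1, 0, 0, 1]
```

Because the codes are mutually orthogonal, all four senders share the channel in the same time slot and each receiver still recovers its own bit, which is why the transfer latency is fixed regardless of how many senders are active.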