{"title":"Improving reconfiguration speed for dynamic circuit specialization using placement constraints","authors":"Amit Kulkarni, Tom Davidson, Karel Heyse, D. Stroobandt","doi":"10.1109/ReConFig.2014.7032534","DOIUrl":"https://doi.org/10.1109/ReConFig.2014.7032534","url":null,"abstract":"Dynamic Circuit Specialization (DCS) is an optimization technique used for implementing a parameterized application on an FPGA. The application is said to be parameterized when some of its inputs, called parameters, are infrequently changing compared to the other inputs. Instead of implementing these parameter inputs as regular inputs, in the DCS approach these inputs are implemented as constants and the design is optimized for these constants. When the parameter values change, the design is re-optimized for the new constant values by reconfiguring the FPGA. It has been investigated that run-time reconfiguration speed is the limiting factor of the DCS implementations on Xilinx FPGAs. We propose an idea to constrain the design's placement and use the custom Xilinx HWICAP driver to improve reconfiguration speed at the cost of a small reduction in design performance. We use Xilinx Virtex-5 and Zynq-SoC as experimental platforms and we have used an 8-bit FIR filter with different tap configurations as our parameterized design whose filter coefficient values are infrequently changing inputs. A drastic improvement in the reconfiguration speed with a factor of 14 is achieved with only a ≈ 6% decrease in performance.","PeriodicalId":137331,"journal":{"name":"2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122117121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An AXI compatible cypress EZ-USB FX3 interface for USB-3.0 SuperSpeed","authors":"Benedikt Janßen, M. Hübner, T. Jaeschke","doi":"10.1109/ReConFig.2014.7032498","DOIUrl":"https://doi.org/10.1109/ReConFig.2014.7032498","url":null,"abstract":"In this paper, we introduce an IP-core which operates as glue logic between the programmable logic of a FPGA and a Cypress EZ-USB FX3 (FX3) USB-3.0 transceiver. The developed IP core communicates with other logic via AXI-4 and enables half duplex connections between the linked logic and the FX3. Thereby the platform can be used by several applications which need a high speed communication to a USB-3.0 capable host. We chose an Enclustra Mercury KX1 board for implementation, which provides the FX3 USB-3.0 transceiver chip, a Xilinx Kintex-7 FPGA and 1 GB of DDR3 SDRAM. As an application we present a radar system. The stream of radar data is received inside the Kintex-7 by a Xilinx JESD204 IP-core and buffered in the DDR3 SDRAM. For processing, the buffered data is transmitted to a host computer via the introduced IP-core. This paper shows our implementation approach, achieved transmission rates, as well as other IP-core features and gives an outlook on future advancements and applications.","PeriodicalId":137331,"journal":{"name":"2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14)","volume":"1973 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130098015","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FPGA implementation of a reconfigurable image encryption system","authors":"M. Ramírez-Torres, J. S. Ibarra, M. Mejía-Carlos","doi":"10.1109/ReConFig.2014.7032524","DOIUrl":"https://doi.org/10.1109/ReConFig.2014.7032524","url":null,"abstract":"This paper presents the hardware implementation of a cryptographic algorithm using a matrix approach, named ESCA (encryption by synchronization of cellular automata). The ESCA system is a symmetric key algorithm, which is based on the rule-90 cellular automaton to implement all their components. In the hardware implementation of the ESCA system was used the soft processor core Microblaze in a Virtex-5 VC5VLX110T FPGA. With this embedded system we can have a reconfigurable encryption system that allows to the user select among different configurations without the need to program the FPGA every time. With the different options to configure the ESCA system, we may increase the security or have a customized system for a faster performance. With this implementation we carry out the encryption of grayscale and RGB color images. In addition, a sparse matrix format was implemented to reduce the latency, and to prove the security of the algorithm, different tests were applied exhibiting good results.","PeriodicalId":137331,"journal":{"name":"2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132141628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A hardware-assisted proof-of-concept for secure VoIP clients on untrusted operating systems","authors":"Maik Ender, Gerd Duppmann, A. Wild, T. Pöppelmann, T. Güneysu","doi":"10.1109/ReConFig.2014.7032489","DOIUrl":"https://doi.org/10.1109/ReConFig.2014.7032489","url":null,"abstract":"In this work we propose a secure architecture for Voice-over-IP (VoIP) that encapsulates all security and privacy critical components and I/O functions into secure hardware and thus drastically reduces the underlying trusted computing base. Our proof-of-concept implementation shows that high security and reliance on established standards and software (e.g., device drivers, transmission control, and protocols) to keep development costs down are no contradiction. Security is ensured as all security and privacy critical operations of the VoIP system are performed in protected hardware and as a consequence a successful attack on any software component (e.g., buffer overflow) does not lead to a violation of security. All I/O devices like microphones, speakers, displays, and dial buttons are directly connected to the secure hardware and cannot be controlled by an adversary even if the software part has been compromised.","PeriodicalId":137331,"journal":{"name":"2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14)","volume":"04 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129265439","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A highly flexible reconfigurable system on a Xilinx FPGA","authors":"Tomas Drahonovsky, M. Rozkovec, O. Novák","doi":"10.1109/ReConFig.2014.7032531","DOIUrl":"https://doi.org/10.1109/ReConFig.2014.7032531","url":null,"abstract":"Runtime reconfigurable systems become more prevalent in numerous practical applications because these systems have a great flexibility. This paper presents a reconfigurable system implemented on Xilinx Field Programmable Gate Array (FPGA) where partial bitstream relocation (PBR), configuration memory readback and internal registers restoration techniques are supported. It can reduce a number of partial bitstreams stored in memory, save the implementation time and generally increase the flexibility of the reconfigurable system. The article describes a relocatable system creation where the relocation procedure is based on the bitstream major address modifications and design where the relocation of individual modules including their internal states is supported.","PeriodicalId":137331,"journal":{"name":"2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125575178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On providing scalable self-healing adaptive fault-tolerance to RTR SoCs","authors":"Byron Navas, Johnny Öberg, I. Sander","doi":"10.1109/ReConFig.2014.7032541","DOIUrl":"https://doi.org/10.1109/ReConFig.2014.7032541","url":null,"abstract":"The dependability of heterogeneous many-core FPGA based systems are threatened by higher failure rates caused by disruptive scales of integration, increased design complexity, and radiation sensitivity. Triple-modular redundancy (TMR) and run-time reconfiguration (RTR) are traditional fault-tolerant (FT) techniques used to increase dependability. However, hardware redundancy is expensive and most approaches have poor scalability, flexibility, and programmability. Therefore, innovative solutions are needed to reduce the redundancy cost but still preserve acceptable levels of dependability. In this context, this paper presents the implementation of a self-healing adaptive fault-tolerant SoC that reuses RTR IP-cores in order to self-assemble different TMR schemes during run-time. The presented system demonstrates the feasibility of the Upset-Fault-Observer concept, which provides a run-time self-test and recovery strategy that delivers fault-tolerance over functions accelerated in RTR cores, at the same time reducing the redundancy scalability cost by running periodic reconfigurable TMR scan-cycles. In addition, this paper experimentally evaluates the trade-off of the implemented reconfigurable TMR schemes by characterizing important fault tolerant metrics i.e., recovery time (self-repair and self-replicate), detection latency, self-assembly latency, throughput reduction, and increase of physical resources.","PeriodicalId":137331,"journal":{"name":"2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115251242","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Low power RAM-based hierarchical CAM on FPGA","authors":"Z. Qian, M. Margala","doi":"10.1109/ReConFig.2014.7032536","DOIUrl":"https://doi.org/10.1109/ReConFig.2014.7032536","url":null,"abstract":"Content Addressable Memories (CAMs) have been widely used to implement various high speed search functions in network devices such as routers and servers. In these devices, the role of CAM is to classify, drop or forward internet packets (i.e., packet classification). However, CAM suffers from several shortcomings such as high power consumption and low integration density. In addition, CAM is not available in most of modern Field Programmable Gate Array (FPGA), which has broad applications in network infrastructures. Therefore RAM-based CAM emulation has emerged as a promising alternative to CAM not only because RAM is a relatively mature technology but also due to the fact that there are more and larger RAM blocks on modern FPGA. In this paper, we propose a hierarchical search scheme for RAM-based CAM on FPGA. If a match is found in previous blocks, no subsequent search will be triggered and therefore average power consumption is reduced. Comparing with previous works which have not employed this technique, simulation results show that our method could reduce the power consumption up to 11.0% and 9.7% for block RAM based and distributed RAM based implementation respectively.","PeriodicalId":137331,"journal":{"name":"2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116177864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The speed-up of detection of space debris using \"InterP\" and \"FLOPS2D\"","authors":"N. Fujita, T. Yanagisawa, H. Kurosaki, H. Oda","doi":"10.1109/ReConFig.2014.7032556","DOIUrl":"https://doi.org/10.1109/ReConFig.2014.7032556","url":null,"abstract":"A new analysis method which can detect faint geostationary (GEO) objects has been proposed. The stacking method, which uses numerous CCD frames to detect objects below the background noise level, is widely used and works well, but has the drawback that detecting objects whose movements are unknown is extremely time-consuming. To overcome this, we developed a new algorithm which uses binarization of CCD images and calculates sum values instead of medians. A hardware realization of the algorithm was developed on an emulator \"InterP\" in preparation for being implemented in the \"FLOPS2D\" modular field programmable gate array (FPGA) system. When implemented in an FPGA, the algorithm reduces the analysis time by a factor of a fifty over a software implementation of the stacking method. We propose that FPGA acceleration is very effective when used in conjunction with modification of existing method to render them suitable for FPGA implementation.","PeriodicalId":137331,"journal":{"name":"2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126543661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the performance and energy efficiency of FPGAs and GPUs for polyphase channelization","authors":"Vignesh Adhinarayanan, Thaddeus Koehn, Krzysztof Kepa, Wu-chun Feng, P. Athanas","doi":"10.1109/ReConFig.2014.7032542","DOIUrl":"https://doi.org/10.1109/ReConFig.2014.7032542","url":null,"abstract":"Wideband channelization is an important and computationally demanding task in the front-end subsystem of several software-defined radios (SDRs). The hardware that supports this task should provide high performance, consume low power, and allow flexible implementations. Several classes of devices have been explored in the past, with the FPGA proving to be the most popular as it reasonably satisfies all three requirements. However, the growing presence of low-power mobile GPUs holds much promise with improved flexibility for instant adaptation to different standards. Thus, in this paper, we present optimized polyphase channelizations for the FPGA and GPU, respectively, that must consider power and accuracy requirements in the context of a military application. The performance in mega-samples per second (MSPS) and energy efficiency in MSPS/watt are compared between the two classes of hardware platforms: FPGA and GPU. The results show that by exploiting the flexible datapath width of FPGAs, FPGA implementations generally deliver an order-of-magnitude better performance and energy efficiency over fixed-width GPU architectures.","PeriodicalId":137331,"journal":{"name":"2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130403518","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PAMS: Pattern Aware Memory System for embedded systems","authors":"Tassadaq Hussain, Nehir Sönmez, Oscar Palomar, O. Unsal, A. Cristal, E. Ayguadé, M. Valero, Shakaib A. Gursal","doi":"10.1109/ReConFig.2014.7032544","DOIUrl":"https://doi.org/10.1109/ReConFig.2014.7032544","url":null,"abstract":"In this paper, we propose a hardware mechanism for embedded multi-core memory system called Pattern Aware Memory System (PAMS). The PAMS supports static and dynamic data structures using descriptors and specialized memory and reduces area, cost, energy consumption and hit latency. When compared with a Baseline Memory System, the PAMS consumes between 3 and 9 times and 1.13 and 2.66 times less program memory for static and dynamic data structures respectively. The benchmarking applications (having static and dynamic data structures) results show that PAMS consumes 20% less hardware resources, 32% less on chip power and achieves a maximum speedup of 52× and 2.9× for static and dynamic data structures respectively. The results show that the PAMS multi-core system transfers data structures up to 4.65× faster than the MicroBlaze baseline system.","PeriodicalId":137331,"journal":{"name":"2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128223162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}