{"title":"A reconfigurable general framework for pipelined image processing: A color mathematical morphology application","authors":"E. C. Pedrino, J. H. Saito, V. O. Roda","doi":"10.1109/SPL.2010.5483023","DOIUrl":"https://doi.org/10.1109/SPL.2010.5483023","url":null,"abstract":"In this paper, a reconfigurable general framework for fast pipelined image processing is presented. The system is capable of processing, in real time, video with a resolution of 640 × 480 pixels at 60 frames per second. The video is supplied to the framework by a color video camera. As an application of the framework, the basic operators of mathematical morphology are implemented in the RGB color space. The processed images can be displayed by any display system compatible with standard composite video. The results obtained using the developed framework are compared with a Matlab implementation.","PeriodicalId":372692,"journal":{"name":"2010 VI Southern Programmable Logic Conference (SPL)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126694525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Architecture for binary mathematical morphology reconfigurable by genetic programming","authors":"E. C. Pedrino, J. H. Saito, V. O. Roda","doi":"10.1109/SPL.2010.5483033","DOIUrl":"https://doi.org/10.1109/SPL.2010.5483033","url":null,"abstract":"Mathematical morphology supplies powerful tools for low level image analysis, with applications in robotic vision, visual inspection, medicine, texture analysis and many other areas. Many of the mentioned applications require dedicated hardware for real time execution. In this paper, the development of a novel reconfigurable hardware using logical and morphological instructions generated automatically by a linear approach based on genetic programming is proposed. The hardware is capable of processing binary images at high speed. The developed system is based on high-capacity PLDs and has among the possible applications: automatic construction of image filters, intelligent pattern recognition, to name just a few. Some applications using the developed reconfigurable system are presented and the results are discussed and compared with other approaches.","PeriodicalId":372692,"journal":{"name":"2010 VI Southern Programmable Logic Conference (SPL)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123630233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research and Partial analysis of overhead of a partition model for a Partially Reconfigurable hardware in a data-driven machine - ChipCflow","authors":"Francisco Dias de Souza Júnior, J. L. Silva, Lucas Sanches, V. Astolfi","doi":"10.1109/SPL.2010.5483013","DOIUrl":"https://doi.org/10.1109/SPL.2010.5483013","url":null,"abstract":"Computer applications have become increasingly complex and require greater processing capacity. In order to achieve higher performance for these applications, algorithms are often implemented in Field-Programmable Gate Arrays (FPGAs). However, most Computer-Aided Design (CAD) tools still use Hardware Description Languages (HDL), which are complex compared with imperative High Level Languages (HLL). ChipCflow is a tool that aims to convert HLL to HDL, using the dynamic dataflow model and Active Partial Reconfiguration (APR). In this paper we present a research report on the hardware architecture's partition model, necessary for the correct allocation of Dataflow Graphs (DFGs) into the FPGA's fabric using APR. In order to calculate the system's logic overhead, we show some results which denote the ratio between the operator's logic and the necessary reconfigurable area, as well as some guidelines about the reconfiguration time of these reconfigurable areas.","PeriodicalId":372692,"journal":{"name":"2010 VI Southern Programmable Logic Conference (SPL)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122893177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using an FPGA digital clock manager to generate sub-nanosecond phase shifts for lidar applications","authors":"William T. Gaughan, Brian Butka","doi":"10.1109/SPL.2010.5483021","DOIUrl":"https://doi.org/10.1109/SPL.2010.5483021","url":null,"abstract":"When using a Light Detection and Ranging (LIDAR) gun for civil engineering applications, anomalous data were observed. A laboratory system was developed to simulate ranging to moving targets in a controlled environment in order to study the anomalous data and the performance of the LIDAR gun. The laboratory system was found to require the ability to generate sub-nanosecond phase shifts which can be updated in real time. This capability is available in high-quality laboratory equipment, but the required equipment was not available to the authors. This research examines the design and implementation of a low-cost system. The final design uses an embedded processor for computation of the necessary phase shifts and possible anomalies and an FPGA system to generate the dynamic precision phase shifts.","PeriodicalId":372692,"journal":{"name":"2010 VI Southern Programmable Logic Conference (SPL)","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125017145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Montgomery modular multiplication on reconfigurable hardware: Fully systolic array vs parallel implementation","authors":"Guilherme Perin, D. Mesquita, F. Herrmann, J. B. Martins","doi":"10.1109/SPL.2010.5483003","DOIUrl":"https://doi.org/10.1109/SPL.2010.5483003","url":null,"abstract":"This paper compares two FPGA Montgomery modular multiplication architectures: a fully systolic array and a parallel implementation. Modular multiplication is employed in modular exponentiation, which is the most important operation of some public-key cryptographic algorithms, the most popular of which is the RSA encryption scheme. The proposed fully systolic array architecture presents a high-radix implementation with carry propagation between the Processing Elements. The parallel implementation is composed of multiplier blocks in parallel with the Processing Elements and provides a pipelined operation mode. We compared the time × area efficiency of both architectures as well as an RSA application. The fully systolic array implementation can run the 1024-bit RSA decryption process in just 3.23 ms and the parallel architecture executes the same operation in 6 ms, which represents competitive state-of-the-art performance for both architectures.","PeriodicalId":372692,"journal":{"name":"2010 VI Southern Programmable Logic Conference (SPL)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121597513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FPGA-based smart sensor implementation with precise frequency to digital converter for flow measurement","authors":"Edval J. P. Santos, Leonardo B. M. Silva","doi":"10.1109/SPL.2010.5483009","DOIUrl":"https://doi.org/10.1109/SPL.2010.5483009","url":null,"abstract":"The realization of an integrated frequency-based smart sensor for flow measurement requires a precise frequency-to-digital converter. A VHDL-based implementation of such a converter, providing a royalty-free solution with 1 ppm resolution, is reported. This work is part of a correlator under development to measure the total flow of multiphase fluids. This intellectual property block can also be used with other frequency-encoded transducers. The converter has been prototyped with a Xilinx™ XC3S500E Spartan-3E FPGA, and has been tested up to 10 MHz.","PeriodicalId":372692,"journal":{"name":"2010 VI Southern Programmable Logic Conference (SPL)","volume":"239 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133426483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Implementation of Split-Radix Fast Fourier Transform on FPGA","authors":"Cynthia Watanabe, C. Silva, J. Muñoz","doi":"10.1109/SPL.2010.5483018","DOIUrl":"https://doi.org/10.1109/SPL.2010.5483018","url":null,"abstract":"Nowadays, portable systems are developed especially for signal processing, where the principal challenge is to find circuits with less area and power consumption. One of the most powerful tools in signal processing is the Fast Fourier Transform (FFT). Many algorithms have been developed to improve its computation time; one of them is the Split-Radix Fast Fourier Transform (SRFFT), which reduces the number of complex computations. Therefore, a new architecture is proposed to compute the SRFFT. Although the runtime of this design is high, it has some important benefits, such as a flexible number of inputs N = 2^P and low resource requirements in terms of combinational functions, logic registers and memory.","PeriodicalId":372692,"journal":{"name":"2010 VI Southern Programmable Logic Conference (SPL)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130117487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Acceleration of HMM-based speech recognition system by parallel FPGA Gaussian calculation","authors":"R. Veitch, Louis-Marie Aubert, R. Woods, S. Fischaber","doi":"10.1109/SPL.2010.5483010","DOIUrl":"https://doi.org/10.1109/SPL.2010.5483010","url":null,"abstract":"An FPGA-based custom core which computes the Gaussian calculation portion of a Hidden Markov Model (HMM) based speech recognition system is presented. The work is part of the development of a custom embedded system which will provide speaker-independent, large-vocabulary continuous speech recognition and is currently presented as a hardware/software codesign. By decoupling the Gaussian calculation from the backend search, calculation of Gaussian results is performed with minimal communication between the backend search software and an FPGA-based Gaussian core. Several implementations have been investigated in order to minimize memory bandwidth and FPGA resource requirements and are presented. The system has been implemented using an Alpha Data XCR-5T1 reconfigurable computer housing a Virtex 5 SX95T FPGA and has achieved better than real-time performance at 133 MHz. The core has been tested and is capable of calculating a full set of Gaussian results from 3825 acoustic models in 5.3 ms, which, coupled with a backend search of 5000 words, has provided over 80% accuracy.","PeriodicalId":372692,"journal":{"name":"2010 VI Southern Programmable Logic Conference (SPL)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115518021","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficiency evaluation and architecture design of SSD unities for the H.264/AVC standard","authors":"G. Sanchez, F. Sampaio, Robson Dornelles, L. Agostini","doi":"10.1109/SPL.2010.5483019","DOIUrl":"https://doi.org/10.1109/SPL.2010.5483019","url":null,"abstract":"This paper presents the design and evaluation of architectures that perform the SSD (Sum of Squared Differences) similarity criterion calculation. The comparison was made with another widely used criterion: the SAD (Sum of Absolute Differences). In order to compare the impact of both criteria on the coding process, a set of executions using the JM 16.0 reference software was performed. In these tests, SSD almost always achieved better video quality than SAD. Three architectures are proposed to perform SSD: (a) the first one uses a multiplexer, (b) the second uses a memory and (c) the last one uses a dedicated multiplier. One architecture to perform SAD is proposed for comparison with the architectures using SSD. Each solution was described in VHDL and synthesized to an Altera Stratix II FPGA. The video quality gain of SSD over SAD encourages the use of SSD calculators even with a lower operation frequency when compared with an SAD implementation. In the best case and considering HDTV 1080p videos (1920×1080 pixels), it is possible to reach real-time processing (30 frames per second) by putting 12 SSD calculators to work in parallel.","PeriodicalId":372692,"journal":{"name":"2010 VI Southern Programmable Logic Conference (SPL)","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124581227","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A high performance hardware architecture for the H.264/AVC half-pixel interpolation unit","authors":"M. Corrêa, M. T. Schoenknecht, Robson Dornelles, L. Agostini","doi":"10.1109/SPL.2010.5482998","DOIUrl":"https://doi.org/10.1109/SPL.2010.5482998","url":null,"abstract":"This work presents a high-performance half-pixel interpolation unit for the H.264/AVC standard. The presented architecture is able to process very high definition videos (3840 × 2048 pixels) in real time (30 frames per second), and can be integrated into a complete motion estimation architecture without limiting the other modules' performance. It also presents a novel arrangement of interpolated samples which simplifies the search for the best fractional motion vector. The architecture was described in VHDL and synthesized to a Xilinx Virtex4 FPGA, and it achieved the best results when compared to related works published in the literature.","PeriodicalId":372692,"journal":{"name":"2010 VI Southern Programmable Logic Conference (SPL)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125452184","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}