{"title":"Proposition and evaluation of a real-time generic architecture for a laser stripe detection system on FPGA","authors":"Seher Colak, E. Dumas, V. Fresse, O. Alata","doi":"10.1109/DASIP.2017.8122110","DOIUrl":"https://doi.org/10.1109/DASIP.2017.8122110","url":null,"abstract":"Laser triangulation applications are commonly used for industrial quality control. Such algorithms require real-time systems often made of a computing unit close to the image sensor through a short and fast link. Choosing a camera with integrated Field Programmable Gate Array (FPGA) as the computing unit can provide high pipeline and parallel computing adapted to process image in real-time. Moreover, it is necessary in the industry to maintain code for several years whatever the system upgrade. So the conceived operators should be flexible to adapt to any hardware changes (sensor or FPGA) or any tool update with minimum effort. The purpose of this article is to present a generic architecture for laser stripe detection based on the centroid algorithm for a FPGA-based system. Evaluation of the use of resources with respect to two parameters (image width and parallelism) is pointed out. With three syntheses, models have been extracted to forecast evolution of these resources and an error analysis have been conducted to validate these models.","PeriodicalId":6637,"journal":{"name":"2017 Conference on Design and Architectures for Signal and Image Processing (DASIP)","volume":"160 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72718654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detecting data-parallel synchronous dataflow graphs","authors":"Sudeep Kanur, J. Lilius, Johan Ersfolk","doi":"10.1109/DASIP.2017.8122118","DOIUrl":"https://doi.org/10.1109/DASIP.2017.8122118","url":null,"abstract":"Synchronous Dataflow (SDF), a popular subset of the dataflow programming paradigm, gives a well structured formalism to capture signal and stream processing applications. With data-parallel architectures becoming ubiquitous, several frameworks leverage the SDF formalism to map applications to parallel architectures. But, these frameworks assume that the Synchronous Dataflow graphs (SDFGs) under consideration already are data-parallel. In this paper, we address the lack of mechanisms required to detect if an SDFG can be executed in a data-parallel fashion. We develop necessary and sufficient conditions that an SDFG must satisfy for its data-parallel execution. In addition, we develop methods that detect and transform SDFGs that cannot be determined to be data-parallel through visual graph inspection alone. We report on a prototype implementation of the developed conditions as a compiler pass in PREESM framework and test them against some useful applications expressed as an SDFG.","PeriodicalId":6637,"journal":{"name":"2017 Conference on Design and Architectures for Signal and Image Processing (DASIP)","volume":"31 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84290403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hardware-software abandoned object detection vision system in heterogeneous zynq device","authors":"T. Kryjak, Artur Skirzynski, M. Gorgon","doi":"10.1109/DASIP.2017.8122122","DOIUrl":"https://doi.org/10.1109/DASIP.2017.8122122","url":null,"abstract":"In this paper a hardware-software abandoned object detection vision system implemented in the Zynq SoC (System on Chip) device is presented. First, the solution was implemented in C++ and run as a bare metal application on the ARM processor core of the Zynq (using floating and fixed-point computations). For the target video stream 1280 χ 720 @ 50 fps (74.25 MHz pixel clock) it reached only 2 fps. Therefore, to speed-up the application, it was decided to move some of the image processing and analysis operations to the programmable logic. This allowed to obtain real-time image processing i.e. 50 fps, with power consumption of less than 4W.","PeriodicalId":6637,"journal":{"name":"2017 Conference on Design and Architectures for Signal and Image Processing (DASIP)","volume":"129 1","pages":"1-2"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74692953","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Steffen Vaas, Peter Ulbrich, M. Reichenbach, D. Fey
{"title":"The best of both: High-performance anc deterministic real-time executive by application-specific multi-core SoCs","authors":"Steffen Vaas, Peter Ulbrich, M. Reichenbach, D. Fey","doi":"10.1109/DASIP.2017.8122107","DOIUrl":"https://doi.org/10.1109/DASIP.2017.8122107","url":null,"abstract":"Embedded multi-core processors improve performance significantly and are desirable in many application-fields. This in particular includes safety-critical real-time systems, which typically require a deterministic temporal behavior. However, even tasks without dependencies running on different cores can interfere due to, sometimes hidden, shared hardware resources, such as common memories or buses. Consequently, only a pessimistic assumption of the worst-case execution time (WCET) that incorporates interference can be given. Hence, the aspired performance gain fizzles out in the poor temporal analyzability. Based on the fact that in safety-critical systems all tasks and their dependencies are known at compile-time, this paper presents an approach to generate application-specific, deterministic multi-core processor architectures for these systems. Thereby safety-critical tasks are executed on dedicated Deterministic Execution Units (DEUs) including lightweight, deterministic processor cores, bus systems, memories and peripherals. The remaining soft real-time tasks are executed on a general purpose multi-core processor that offers performance over determinism. Consequently, timing analysis for hard real-time tasks is significantly simplified, since interferences caused by shared resources and scheduling are effectively eliminated. To show the benefits of our approach, an application-specific architecture for a flight controller was generated and compared to an ARM Cortex-A9 dual-core as reference. Overall, we were able to significantly improve temporal properties of safety-critical tasks while preserving the overall performance for soft real-time tasks.","PeriodicalId":6637,"journal":{"name":"2017 Conference on Design and Architectures for Signal and Image Processing (DASIP)","volume":"7 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78960169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. Ibrahim, W. Simon, Damien Doy, E. Pignat, F. Angiolini, M. Arditi, J. Thiran, G. Micheli
{"title":"Single-FPGA complete 3D and 2D medical ultrasound imager","authors":"A. Ibrahim, W. Simon, Damien Doy, E. Pignat, F. Angiolini, M. Arditi, J. Thiran, G. Micheli","doi":"10.1109/DASIP.2017.8122113","DOIUrl":"https://doi.org/10.1109/DASIP.2017.8122113","url":null,"abstract":"3D ultrasound (US) acquisition acquires volumetric images, thus alleviating a classical US imaging bottleneck that requires a highly-trained sonographer to operate the US probe. However, this opportunity has not been explored in practice, since 3D US machines are only suitable for hospital usage in terms of cost, size and power requirements. In this work we propose the first fully-digital, single-chip 3D US imager on FPGA. The proposed design is a complete processing pipeline that includes pre-processing, image reconstruction, and post-processing. It supports up to 1024 input channels, which matches or exceeds state of the art, in an unprecedented estimated power budget of 6.1 W. The imager exploits a highly scalable architecture which can be either downscaled for 2D imaging, or further upscaled on a larger FPGA. Our platform supports both real-time inputs over an optical cable, or test data feeds sent by a laptop running Matlab and custom tools over an Ethernet connection. Additionally, the design allows HDMI video output on a screen.","PeriodicalId":6637,"journal":{"name":"2017 Conference on Design and Architectures for Signal and Image Processing (DASIP)","volume":"12 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75556644","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
T. Lieske, W. Uhring, N. Dumas, J. Léonard, D. Fey
{"title":"Embedded fluorescence lifetime determination for high throughput real-time droplet sorting with microfluidics","authors":"T. Lieske, W. Uhring, N. Dumas, J. Léonard, D. Fey","doi":"10.1109/DASIP.2017.8122129","DOIUrl":"https://doi.org/10.1109/DASIP.2017.8122129","url":null,"abstract":"Time-resolved fluorescence (TRF) analysis is considered to be among the primary research tools in biochemistry and biophysics. One application of this method is the investigation of biomolecular interactions with promising applications for biosensing. For the latter context, time-correlated single photon counting (TCSPC) is the most sensitive, hence preferred implementation of TRF. However, high throughput applications are presently limited by the maximum achievable photon acquisition rate, and even more by the data processing rate. The latter rate is actually limited by the computational complexity to estimate accurately the fluorescence lifetime from TCSPC data. Here we propose a solution that would enable the implementation of TRF detection for fluorescence-activated droplet sorting (FADS), a particularly high throughput, microfluidic-based technology. Most fluorescence lifetime algorithms require a large number of detected photons for an accurate lifetime computation. This paper presents an implementation based on a maximum likelihood estimator (MLE), enabling high precision estimation with a limited number of detected photons, significantly reducing the total measurement time. This speedup rapidly increases the input data rate. As a result, off-the-shelf embedded products cannot handle the data rates produced by current TCSPC units that are used to measure the fluorescence. Therefore, a configurable real-time capable hardware architecture is implemented on a field-programmable gate array (FPGA) that can handle the data rates of future TCSPC units, rendering high throughput droplet sorting with microfluidics possible.","PeriodicalId":6637,"journal":{"name":"2017 Conference on Design and Architectures for Signal and Image Processing (DASIP)","volume":"38 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75648080","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Seyyed Mahdi Najmabadi, Harsimran Singh Tungal, Trung-Hieu Tran, S. Simon
{"title":"Hardware-based architecture for asymmetric numeral systems entropy decoder","authors":"Seyyed Mahdi Najmabadi, Harsimran Singh Tungal, Trung-Hieu Tran, S. Simon","doi":"10.1109/DASIP.2017.8122109","DOIUrl":"https://doi.org/10.1109/DASIP.2017.8122109","url":null,"abstract":"In this paper, two novel hardware architectures based on tabled asymmetric numeral systems decoding algorithm are proposed. In the proposed architectures the decoding throughput is highly dependent on the how much the data is compressed at encoding time. The synthesis results presented here show that the throughput of the parallel architecture can reach up 200 MB/s. The benchmarks show that the parallel architecture that runs on Xilinx Kintex FPGA provides higher throughout in comparison with the same algorithm running on Core i3 CPU.","PeriodicalId":6637,"journal":{"name":"2017 Conference on Design and Architectures for Signal and Image Processing (DASIP)","volume":"84 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82406237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deepayan Bhowmik, Paulo Garcia, A. Wallace, Robert J. Stewart, G. Michaelson
{"title":"Power efficient dataflow design for a heterogeneous smart camera architecture","authors":"Deepayan Bhowmik, Paulo Garcia, A. Wallace, Robert J. Stewart, G. Michaelson","doi":"10.1109/DASIP.2017.8122128","DOIUrl":"https://doi.org/10.1109/DASIP.2017.8122128","url":null,"abstract":"Visual attention modelling characterises the scene to segment regions of visual interest and is increasingly being used as a pre-processing step in many computer vision applications including surveillance and security. Smart camera architectures are an emerging technology and a foundation of security and safety frameworks in modern vision systems. In this paper, we present a dataflow design of a visual saliency based camera architecture targeting a heterogeneous CPU+FPGA platform to propose a smart camera network infrastructure. The proposed design flow encompasses image processing algorithm implementation, hardware & software integration and network connectivity through a unified model. By leveraging the properties of the dataflow paradigm, we iteratively refine the algorithm specification into a deployable solution, addressing distinct requirements at each design stage: from algorithm accuracy to hardware-software interactions, real-time execution and power consumption. Our design achieved real-time run time performance and the power consumption of the optimised asynchronous design is reported at only 0.25 Watt. The resource usages on a Xilinx Zynq platform remains significantly low.","PeriodicalId":6637,"journal":{"name":"2017 Conference on Design and Architectures for Signal and Image Processing (DASIP)","volume":"79 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84108852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
O. Ibraheem, Arif Irwansyah, J. Hagemeyer, Mario Porrmann, U. Rückert
{"title":"Reconfigurable vision processing system for player tracking in indoor sports","authors":"O. Ibraheem, Arif Irwansyah, J. Hagemeyer, Mario Porrmann, U. Rückert","doi":"10.1109/DASIP.2017.8122114","DOIUrl":"https://doi.org/10.1109/DASIP.2017.8122114","url":null,"abstract":"In recent years, there has been an increasing growth of using vision-based systems for tracking the players in team sports to evaluate and enhance their performance. Vision-based player tracking has high computational demands since it requires processing of a huge amount of video data based on the utilization of multiple cameras with high resolution and high frame rates. In this paper, we present a reconfigurable system to track the players in indoor sports automatically without user interaction. The proposed system can process live video data streams from multiple cameras as well as offline data from recorded video files. FPGA technology is used to accelerate this player tracking system by implementing the video acquisition, video preprocessing, player segmentation, and team identification & player detection modules in hardware, realizing a real-time system. The teams are identified and the players' positions are detected based on the colors of their jerseys. The detection results are sent from the FPGA to the host-PC where the players are tracked. Our results show that the achieved average player detection rate is up to 95.5%. The proposed system can process live video data using two GigE Vision cameras with a resolution of 1392×1040 pixels and 30 fps for each camera. A speed-up of 20 is achieved compared to an OpenCV-based software implementation on a host-PC equipped with a 2.93 GHz Intel i7 CPU.","PeriodicalId":6637,"journal":{"name":"2017 Conference on Design and Architectures for Signal and Image Processing (DASIP)","volume":"28 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83509331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An overflow free fixed-point eigenvalue decomposition algorithm: Case study of dimensionality reduction in hyperspectral images","authors":"Bibek Kabi, Anand S. Sahadevan, Tapan Pradhan","doi":"10.1109/DASIP.2017.8122131","DOIUrl":"https://doi.org/10.1109/DASIP.2017.8122131","url":null,"abstract":"We consider the problem of enabling robust range estimation of eigenvalue decomposition (EVD) algorithm for a reliable fixed-point design. The simplicity of fixed-point circuitry has always been so tempting to implement EVD algorithms in fixed-point arithmetic. Working towards an effective fixed-point design, integer bit-width allocation is a significant step which has a crucial impact on accuracy and hardware efficiency. This paper investigates the shortcomings of the existing range estimation methods while deriving bounds for the variables of the EVD algorithm. In light of the circumstances, we introduce a range estimation approach based on vector and matrix norm properties together with a scaling procedure that maintains all the assets of an analytical method. The method could derive robust and tight bounds for the variables of EVD algorithm. The bounds derived using the proposed approach remain same for any input matrix and are also independent of the number of iterations or size of the problem. Some benchmark hyperspectral data sets have been used to evaluate the efficiency of the proposed technique. It was found that by the proposed range estimation approach, all the variables generated during the computation of Jacobi EVD is bounded within ±1.","PeriodicalId":6637,"journal":{"name":"2017 Conference on Design and Architectures for Signal and Image Processing (DASIP)","volume":"15 1","pages":"1-9"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79128930","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}