Workshop on Design and Architectures for Signal and Image Processing (14th edition): Latest Publications

Hardware-software implementation of the PointPillars network for 3D object detection in point clouds

Workshop on Design and Architectures for Signal and Image Processing (14th edition) Pub Date : 2021-01-18 DOI: 10.1145/3441110.3441150

Joanna Stanisz, K. Lis, T. Kryjak, M. Gorgon

Citations: 1

Convolutional Fully-Connected Capsule Network (CFC-CapsNet)

Workshop on Design and Architectures for Signal and Image Processing (14th edition) Pub Date : 2021-01-18 DOI: 10.1145/3441110.3441148

Pouya Shiri, A. Baniasadi

Citations: 5

Automotive perception system evaluation with reference data obtained by a UAV

Workshop on Design and Architectures for Signal and Image Processing (14th edition) Pub Date : 2021-01-18 DOI: 10.1145/3441110.3441151

Krzysztof Błachut, M. Danilowicz, Hubert Szolc, Mateusz Wasala, T. Kryjak, Nikodem Pankiewicz, M. Komorkiewicz

Citations: 2

Low-Power Sign-Magnitude FFT Design for FMCW Radar Signal Processing

Workshop on Design and Architectures for Signal and Image Processing (14th edition) Pub Date : 2021-01-18 DOI: 10.1145/3441110.3441145

O. Meteer, M. Bekooij

Citations: 1

On Cache Limits for Dataflow Applications and Related Efficient Memory Management Strategies

Workshop on Design and Architectures for Signal and Image Processing (14th edition) Pub Date : 2021-01-18 DOI: 10.1145/3441110.3441573

Alemeh Ghasemi, R. Cataldo, J. Diguet, Kevin J. M. Martin

Abstract: The dataflow paradigm frees the designer to focus on the functionality of an application, independently of the underlying architecture executing it. While mapping the dataflow computational part to the cores seems obvious, the memory aspects do not match accordingly. Dataflow compilers usually do not consider the presence of caches when generating code. A generally accepted idea is that bigger and multi-level caches improve the performance of applications. Unfortunately, state-of-the-art dataflow compilers may prove the exception to this rule. This paper presents two efficient memory management strategies for dataflow applications through a study of the impact of cache sharing, cache size, and the number of cache levels on them. The results show that bigger is not always better, and that the foreseen future of more cores and bigger caches does not guarantee software-free better performance for dataflow applications. We propose two strategies, which can be used concurrently, to address the memory aspects of the dataflow model: copy-on-write and non-temporal memory transfers. Experimental results show that we speed up a computer stereo vision application by 2.1× and reduce the number of L1 data cache misses by 45% while keeping the actors’ source code and design intact.
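
As a rough illustration of the copy-on-write idea named in the abstract, the sketch below shares one token buffer among several consumers and copies it only when a consumer writes. It is not the authors' implementation: the names `CowBuffer`, `share`, and `broadcast` are hypothetical, and a real dataflow runtime would apply this to generated FIFO code rather than Python objects.

```python
class _Shared:
    """Backing storage that several CowBuffer handles may reference."""

    def __init__(self, data):
        self.data = data
        self.refs = 1


class CowBuffer:
    """Hypothetical copy-on-write token buffer (illustrative only)."""

    copies_made = 0  # counts real copies, for demonstration purposes

    def __init__(self, data):
        self._shared = _Shared(bytearray(data))

    def share(self):
        """Return a new handle on the same storage without copying it."""
        other = CowBuffer.__new__(CowBuffer)
        other._shared = self._shared
        self._shared.refs += 1
        return other

    def read(self, index):
        return self._shared.data[index]

    def write(self, index, value):
        # Copy only if another handle still references this storage.
        if self._shared.refs > 1:
            self._shared.refs -= 1
            self._shared = _Shared(bytearray(self._shared.data))
            CowBuffer.copies_made += 1
        self._shared.data[index] = value


def broadcast(token, n_consumers):
    """Hand the same token to several consumers without duplicating it."""
    return [token.share() for _ in range(n_consumers)]


if __name__ == "__main__":
    frame = CowBuffer(b"\x00" * 8)
    left, right = broadcast(frame, 2)
    left.read(0)         # read-only consumer: no copy is made
    right.write(0, 255)  # first write triggers one private copy
    print(CowBuffer.copies_made, left.read(0), right.read(0))
```

Consumers that only read never pay for a copy, which is the case a broadcast of an input image to several read-only actors would fall into.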

Citations: 0

Multiple Transform Selection concept modeling and implementation using Interface Based SDF graphs

Workshop on Design and Architectures for Signal and Image Processing (14th edition) Pub Date : 2021-01-18 DOI: 10.1145/3441110.3441153

Naouel Haggui, Fatma Belghith, W. Hamidouche, N. Masmoudi, J. Nezan

Abstract: Recent studies predict that video data will account for 82% of Internet traffic by 2022. This fact has motivated MPEG to define a new video coding standard called Versatile Video Coding (VVC), which will be released by the end of 2020. VVC will offer the possibility to handle new video formats and to significantly improve video compression over its predecessor, HEVC. Indeed, the objective is to halve the necessary bit rate at equivalent quality. These advances require the use of more complex algorithms, although the increase in complexity has been limited throughout the standardization process. In order to decrease the complexity of VVC, and consequently the coding execution time, several methods have been introduced at different stages of the encoder. The aim of this paper is to explore the available parallelism of VVC to accelerate the coding and decoding processes. This paper focuses on the transformation block, and more specifically on the new concept of Multiple Transform Selection (MTS) introduced by VVC. Moreover, a study of several granularity levels of Interface-Based Synchronous Dataflow (IBSDF) models and their impact on the performance obtained on x86 architectures is presented. An IBSDF dataflow graph has been developed to reveal the available parallelism of MTS. The PREESM fast prototyping tool is then used for the mapping and scheduling of MTS on virtual and real parallel architectures and for generating efficient parallel implementations on real architectures. PREESM has been used in this work to explore the potential parallelism offered by MTS and to prove the efficiency of MTS on multicore x86 architectures. Experimental results show a speed-up close to the optimum.
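
To make concrete how a synchronous dataflow (SDF) graph exposes parallelism, the sketch below solves the standard SDF balance equations to obtain the repetition vector, that is, how many times each actor fires per graph iteration. The three-actor graph (`split`, `transform`, `select`) and its rates are invented for illustration and are not taken from the paper; the function handles flat SDF only, not the hierarchy that IBSDF adds.

```python
from fractions import Fraction
from math import lcm  # multi-argument lcm requires Python 3.9+


def repetition_vector(actors, edges):
    """Smallest positive firing counts satisfying the SDF balance equations.

    edges: list of (src, dst, produced, consumed), meaning src produces
    `produced` tokens per firing and dst consumes `consumed` per firing.
    """
    rate = {actors[0]: Fraction(1)}
    pending = [actors[0]]
    while pending:
        current = pending.pop()
        for src, dst, produced, consumed in edges:
            # Balance equation: rate[src] * produced == rate[dst] * consumed
            if src == current:
                other, value = dst, rate[current] * produced / consumed
            elif dst == current:
                other, value = src, rate[current] * consumed / produced
            else:
                continue
            if other not in rate:
                rate[other] = value
                pending.append(other)
            elif rate[other] != value:
                raise ValueError("inconsistent SDF graph")
    if len(rate) != len(actors):
        raise ValueError("graph is not connected")
    # Scale the rational rates up to the smallest integer solution.
    scale = lcm(*(r.denominator for r in rate.values()))
    return {actor: int(r * scale) for actor, r in rate.items()}


if __name__ == "__main__":
    # Hypothetical graph: one split feeds four independent transform firings.
    actors = ["split", "transform", "select"]
    edges = [("split", "transform", 4, 1), ("transform", "select", 1, 4)]
    print(repetition_vector(actors, edges))
```

In this invented graph, `transform` fires four times per iteration with no dependencies among its firings, which is the kind of data parallelism a scheduler could map onto separate cores.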

Citations: 2

Gegelati: Lightweight Artificial Intelligence through Generic and Evolvable Tangled Program Graphs

Workshop on Design and Architectures for Signal and Image Processing (14th edition) Pub Date : 2020-12-15 DOI: 10.1145/3441110.3441575

K. Desnos, Nicolas Sourbier, Pierre-Yves Raumer, Olivier Gesny, M. Pelcat

Abstract: Tangled Program Graph (TPG) is a reinforcement learning technique based on genetic programming concepts. On state-of-the-art learning environments, TPGs have been shown to offer comparable competence with Deep Neural Networks (DNNs), for a fraction of their computational and storage cost. This lightness of TPGs, both for training and inference, makes them an interesting model to implement Artificial Intelligences (AIs) on embedded systems with limited computational and storage resources. In this paper, we introduce the Gegelati library for TPGs. Besides introducing the general concepts and features of the library, two main contributions are detailed in the paper: (1) the parallelization of the deterministic training process of TPGs, for supporting heterogeneous Multiprocessor Systems-on-Chips (MPSoCs); (2) the support for customizable instruction sets and data types within the genetically evolved programs of the TPG model. The scalability of the parallel training process is demonstrated through experiments on architectures ranging from a high-end 24-core processor to a low-power heterogeneous MPSoC. The impact of customizable instructions on the outcome of a training process is demonstrated on a state-of-the-art reinforcement learning environment.
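
The abstract does not say how Gegelati keeps parallel training deterministic. One common pattern, shown here purely as an assumption, is to give each evaluation job a private random generator seeded from the generation seed and the job index, so that scores do not depend on thread count or scheduling. The functions `evaluate` and `evaluate_generation` and the toy scoring loop are hypothetical and do not reflect the library's API.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def evaluate(job):
    """Score one candidate using a private, reproducibly seeded RNG."""
    generation_seed, index, candidate = job
    # Seed depends only on the job identity, never on the worker thread.
    rng = random.Random(generation_seed * 1_000_003 + index)
    # Stand-in for running a policy in a stochastic environment.
    return sum(candidate * rng.random() for _ in range(100))


def evaluate_generation(candidates, generation_seed, workers):
    """Evaluate all candidates in parallel with reproducible results."""
    jobs = [(generation_seed, i, c) for i, c in enumerate(candidates)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in submission order, not completion order.
        return list(pool.map(evaluate, jobs))


if __name__ == "__main__":
    population = [0.5, 1.0, 2.0, 4.0]
    sequential = evaluate_generation(population, generation_seed=42, workers=1)
    parallel = evaluate_generation(population, generation_seed=42, workers=4)
    print(sequential == parallel)
```

Because no job shares random state with another, the same seed reproduces the same scores on one worker or many, which is the property that makes a parallel training run repeatable.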

Citations: 9