2022 IEEE International Workshop on Rapid System Prototyping (RSP): Latest Publications

TernaryNeRF: Quantizing Voxel Grid-based NeRF Models

2022 IEEE International Workshop on Rapid System Prototyping (RSP) Pub Date : 2022-10-13 DOI: 10.1109/RSP57251.2022.10039009

Seungyeop Kang, S. Yoo

Citations: 0

Enhancing embedded AI-based object detection using multi-view approach

2022 IEEE International Workshop on Rapid System Prototyping (RSP) Pub Date : 2022-10-13 DOI: 10.1109/RSP57251.2022.10039026

Z. Ning, Mostafa Rizk, A. Baghdadi, J. Diguet

Citations: 0

A framework that enables systematic analysis of mixed-signal applications on FPGA

2022 IEEE International Workshop on Rapid System Prototyping (RSP) Pub Date : 2022-10-13 DOI: 10.1109/RSP57251.2022.10039031

Gabriel Rutsch, Maximilian Groebner, A. Sanders, Konrad Maier, W. Ecker

Citations: 1

Automatically Restructuring HDL Modules for Improved Reusability in Rapid Synthesis

2022 IEEE International Workshop on Rapid System Prototyping (RSP) Pub Date : 2022-10-13 DOI: 10.1109/RSP57251.2022.10039003

Jakob Wenzel, C. Hochberger

Abstract: Implementing nontrivial HDL designs can take a lot of time. Particularly for FPGAs, vendor tools tend to become slower, since the devices grow and thus, also the designs grow. It is therefore desirable to create mechanisms that speed up the implementation. Combining pre-implemented blocks to build the final design can be one such mechanism. It can help to reduce the time required for incremental builds, or it can reduce the time required to build families of designs. Yet, typical HDL code is not structured for this purpose. Many modules do not have the right size to be used as pre-implemented blocks. In this paper, we present a methodology to automatically analyze and modify existing HDL code such that the resulting module structure fits the purpose of pre-implementing the modules. To this end, we try to isolate parameters of the HDL code such that we have to reimplement only a small number of modules after a parameter change. The resulting tool is available as open-source software. We have tested our methodology using multiple different benchmark sets, which in total contain thousands of modules. On average, we can extract around 10% of the parameters into smaller modules.

Citations: 0

Machine Learning-Based Hard/Soft Logic Trade-offs in VTR

2022 IEEE International Workshop on Rapid System Prototyping (RSP) Pub Date : 2022-10-13 DOI: 10.1109/RSP57251.2022.10039002

Ritwik Sinha, S. A. Damghani, K. Kent

Abstract: Circuit optimization, in any application, is of high importance since it not only improves the efficiency of the intended purpose but also enhances the quality of the final product. It enables the circuit designer to cater to the specific needs of the customer. For circuit optimization to occur, we need to elaborate these circuits on a primary level and perform synthesis operations. Previous research shows that the investigation of improvements to different Hardware Description Language (HDL) elaboration phases, was completely closed source. Verilog To Routing (VTR) is an open-source Electronic Design Automation (EDA) tool. ODIN II is the VTR synthesizer that parses the input Verilog, elaborates its Abstract Syntax Tree (AST), performs the partial mapping according to the architecture file, and performs optimizations such as unused logic removal. To that end, the hard versus soft logic trade-off aims to optimize the performance of the circuit. This project focuses on using machine learning approaches to make synthesis tools intelligent enough to decide this ratio on their own, without the need for human intervention, and based on some predefined criteria. This paper discusses the criteria for having less latency or less critical path delay in the circuit. Also, it aims at providing this level of intelligence at an earlier stage in the VTR pipeline to make better use of this information.

Citations: 0

ANN-based Performance Estimation of Embedded Software for RISC-V Processors

2022 IEEE International Workshop on Rapid System Prototyping (RSP) Pub Date : 2022-10-13 DOI: 10.1109/RSP57251.2022.10039004

Weiyan Zhang, Mehran Goli, Alireza Mahzoon, R. Drechsler

Abstract: The demand for optimized and efficient embedded software is increasing in many applications such as the Internet of Things (IoT) or other Cyber-Physical Systems (CPS). Hence, early performance analysis of embedded software is essential to perform Design Space Exploration (DSE), ensure efficiency, and meet time-to-market constraints. Designers usually use real hardware, simulators, or static analyzers to obtain the performance. However, these methods suffer from serious drawbacks as real hardware is not available in the early stage of the design process, simulators either do not support any timing accuracy or require large execution time, and static analyzers need details of the hardware microarchitecture. In this paper, we present a novel Artificial Neural Network (ANN)-based approach that allows a fast and accurate performance estimation of embedded software for RISC-V processors in the early design phases. This can significantly reduce the burden on designers to perform DSE. The proposed approach takes advantage of the dynamic analysis technique and analytical models and does not require any microarchitecture-related parameters such as cache misses, cache hits, and memory-level parallelism. We compare our proposed microarchitecture-independent approach with state-of-the-art in terms of speed and accuracy. Our experiments on various benchmarks demonstrate that the proposed approach achieves a speed-up of $4.41\times$ compared to a RISC-V Virtual Prototype (VP) at the Electronic System Level (ESL), while the estimation results have only a Mean Absolute Percentage Error (MAPE) of 2%.
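The two figures of merit quoted in this abstract, speed-up and Mean Absolute Percentage Error (MAPE), can be illustrated with a minimal sketch. The function names and the cycle counts below are hypothetical and chosen only for illustration; they are not taken from the paper, and the paper's own evaluation procedure may differ.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Percentage Error, returned in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when a reference value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """How many times faster the candidate runs than the baseline."""
    return baseline_seconds / candidate_seconds


# Hypothetical cycle counts: reference values (for example from a
# virtual prototype) against values produced by an estimator.
reference_cycles = [1000.0, 2000.0, 4000.0]
estimated_cycles = [1020.0, 1960.0, 4080.0]
print(f"MAPE: {mape(reference_cycles, estimated_cycles):.2f}%")  # MAPE: 2.00%
```

Each hypothetical estimate is off by exactly 2% of its reference, so the mean is 2%, the same order as the error reported in the abstract.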

Citations: 1

A Case for Second-Level Software Cache Coherency on Many-Core Accelerators

2022 IEEE International Workshop on Rapid System Prototyping (RSP) Pub Date : 2022-10-13 DOI: 10.1109/RSP57251.2022.10038999

Arthur Vianès, F. Pétrot, F. Rousseau

Abstract: Cache and cache-coherence are major aspects of today's high performance computing. A cache stores data as cache-lines of fixed size, and coherence between caches is guaranteed by the cache-coherence protocol which operates on fixed size coherency-blocks. In such systems cache-lines and coherency-blocks are usually the same size and are relatively small, typically 64 bytes. This size choice is a trade-off selected for general-purpose computing: it minimizes false-sharing while keeping cache-maintenance traffic low. False-sharing is considered an unnecessary cache-coherence traffic and it decreases performances. However, for dedicated accelerator this trade-off may not be appropriate: hardware in charge of cache-coherence is expensive and not well exploited by most accelerator applications as by construction these applications minimize false-sharing. This paper investigates the possibility of an alternative trade-off of cache-coherency and cache-maintenance block size for many-core accelerators, by decoupling coherency-block and cache-lines sizes. Interests, advantages and difficulties are presented and discussed in this paper. Then we also discuss needs of software and hardware modifications in prototypes and the capability of such prototypes to evaluate different coherence-block sizes.
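The dependence of false sharing on coherency-block size, which the abstract describes as a trade-off, can be sketched with a small address calculation. This is an illustrative model only: the helper names, the addresses, and the block sizes other than 64 bytes are assumptions, not values from the paper.

```python
def block_index(address: int, block_size: int) -> int:
    """Index of the coherency block that contains a byte address."""
    return address // block_size


def falsely_shared(addr_a: int, addr_b: int, block_size: int) -> bool:
    """True when two distinct addresses fall into the same coherency block."""
    return addr_a != addr_b and block_index(addr_a, block_size) == block_index(addr_b, block_size)


# Hypothetical layout: two cores each update their own counter,
# and the counters are placed 128 bytes apart.
counter_core0 = 0x1000
counter_core1 = 0x1080

for size in (64, 256, 1024):
    print(size, falsely_shared(counter_core0, counter_core1, size))
# 64 False
# 256 True
# 1024 True
```

With 64-byte blocks the two counters are tracked independently. With larger coherency blocks they share a block, so writes by one core would trigger coherence actions for the other unless software places the data further apart.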

Citations: 0

Marine Objects Detection Using Deep Learning on Embedded Edge Devices

2022 IEEE International Workshop on Rapid System Prototyping (RSP) Pub Date : 2022-10-13 DOI: 10.1109/RSP57251.2022.10039025

Dominique Heller, Mostafa Rizk, R. Douguet, A. Baghdadi, J. Diguet

Abstract: Artificial Intelligence techniques based on convolution neural networks (CNNs) are now dominant in the field of object detection and classification. The deployment of CNNs on embedded edge devices targeting real-time inference sets a challenge due to the limited computing resources and power budgets. Several optimization techniques such as pruning, quantization and use of light neural networks enable the real-time inference but at the cost of precision degradation. However, using efficient approaches to apply the optimization techniques at training and inference stages enable high inference speed with limited degradation of detection performance. In this paper, we revisit the problem of detecting and classifying maritime objects. We investigate different versions of the You Only Look Once (YOLO), a state-of-the-art deep neural network, for real-time object detection and compare their performance for the specific application of detecting maritime objects. The trained YOLO networks are efficiently optimized targeting three recent edge devices: Nvidia Jetson Xavier AGX, AMD-Xilinx Kria KV260 Vision AI Kit, and Movidius Myriad X VPU. The proposed deployments demonstrate promising results with an inference speed of 90 FPS and a limited degradation of 2.4% in mean average precision.

Citations: 3

Early prototyping and testing of CERN LHC CMS high-granularity calorimeter slow-control system

2022 IEEE International Workshop on Rapid System Prototyping (RSP) Pub Date : 2022-10-13 DOI: 10.1109/RSP57251.2022.10039014

Martim Rosado, S. Mallios, P. Tomás, N. Roma, A. David

Abstract: The Compact Muon Solenoid (CMS) high-granularity calorimeter (HGCAL) upgrade for CERN's Large Hadron Collider (LHC) high-luminosity phase is a detector with more than 6 million channels that will provide precise sensing and measurement of position, timing, and energy of the particles produced in the collisions of the beams. The HGCAL electronics are a large and complex set of processing systems split into front-end and back-end. The front-end, located in the experimental cavern, consists of $\approx 150$ thousand radiation tolerant ASICs. The high-density FPGA-based back-end is housed away from the radiation area in a set of Advanced Telecommunications Computing Architecture (ATCA) boards and crates hosting $\approx 100$ FPGAs. Each ATCA back-end board will comprise one (or two) FPGAs, managing up to $\approx 120$ optical links, each providing a transmission rate of 10.24 Gb/s between the back-end and the front-end electronics. Each back-end FPGA is responsible for configuring and monitoring up to $\approx 3500$ front-end ASICs and will be controlled by software running on a back-end MPSoC that provides the entry point for the whole control procedure. This paper presents the design and implementation of the prototyping infrastructure deployed to test and validate the slow-control block of the HGCAL back-end electronics, together with the related interfaces with the controller MPSoC and the front-end transceiver ASICs. The required functionalities have been validated with a ZCU102 Xilinx Ultrascale+ development board, which emulated the back-end elements that are still under development and not yet available for this comprehensive test. This development board was connected to other custom ASIC development boards via optical links, emulating the front-end side of the system, also still under development. Besides providing reliable testing and validation of the operation of the whole infrastructure, the prototyping platform also allowed to attain the required software/hardware portability that ensures easy integration/replacement of all the (still) emulated components with their final implementations.
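As a rough sense of scale, the approximate figures quoted in this abstract can be combined in a back-of-envelope calculation. The sketch below is an assumption-laden illustration, not a result from the paper: every input is an "approximately" or "up to" value, the function names are hypothetical, and the outputs are only bounds implied by those numbers.

```python
import math

# Approximate figures quoted in the abstract.
FRONT_END_ASICS = 150_000   # about 150 thousand front-end ASICs
LINKS_PER_FPGA = 120        # up to about 120 optical links per FPGA
LINK_RATE_GBPS = 10.24      # line rate of each optical link
ASICS_PER_FPGA = 3_500      # up to about 3500 ASICs handled per FPGA


def aggregate_link_rate_gbps(links: int = LINKS_PER_FPGA,
                             rate: float = LINK_RATE_GBPS) -> float:
    """Upper bound on the combined raw line rate of one FPGA's links."""
    return links * rate


def min_fpgas_for_control(asics: int = FRONT_END_ASICS,
                          per_fpga: int = ASICS_PER_FPGA) -> int:
    """Fewest FPGAs that could cover all ASICs if each ran at its maximum."""
    return math.ceil(asics / per_fpga)


print(f"{aggregate_link_rate_gbps():.1f} Gb/s per FPGA (raw upper bound)")
print(f"at least {min_fpgas_for_control()} FPGAs to reach every ASIC")
```

The lower bound on the FPGA count sits below the roughly 100 FPGAs mentioned in the abstract, which is what one would expect if the per-FPGA figures are maxima rather than typical loads.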

Citations: 0