Loïc France, Florent Bruguier, M. Mushtaq, D. Novo, P. Benoit
{"title":"Implementing Rowhammer Memory Corruption in the gem5 Simulator","authors":"Loïc France, Florent Bruguier, M. Mushtaq, D. Novo, P. Benoit","doi":"10.1109/RSP53691.2021.9806242","DOIUrl":"https://doi.org/10.1109/RSP53691.2021.9806242","url":null,"abstract":"Modern computer memories have been shown to have reliability issues. The main memory is the target of a security threat called Rowhammer, which causes bit flips in adjacent victim cells of aggressor rows. Numerous countermeasures have been proposed, some of the most efficient ones relying on memory controller modifications, which make them non-integrable in existing systems. These solutions have to be effective against attacks on current and future architectures and technology nodes. In order to prove the efficiency of such mitigation techniques, we have to use simulation platforms. Unfortunately, existing architecture simulators do not provide any implementation of unintended memory modifications like bit flips. Integrating memory corruption into architecture simulators would allow the construction of attacks and mitigations for current and future computers, using feedback from the simulator. In this paper, we propose an implementation of the Rowhammer effect in the gem5 architecture simulator, demonstrate its capabilities and state its limitations.","PeriodicalId":229411,"journal":{"name":"2021 IEEE International Workshop on Rapid System Prototyping (RSP)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124692055","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
F. Magalhães, M. Nikdast, Fabiano Hessel, O. Liboiron-Ladouceur, G. Nicolescu
{"title":"HyCo: A Low-Latency Hybrid Control Plane for Optical Interconnection Networks","authors":"F. Magalhães, M. Nikdast, Fabiano Hessel, O. Liboiron-Ladouceur, G. Nicolescu","doi":"10.1109/RSP53691.2021.9806198","DOIUrl":"https://doi.org/10.1109/RSP53691.2021.9806198","url":null,"abstract":"Next-generation multiprocessor systems point to the integration of a large number of cores (e.g., processing and memory) where electrical networks-on-chip (eNoCs) can improve the communication performance. As the number of integrated cores increases, metallic interconnect in eNoCs becomes a bottleneck, leading to communication performance degradation and increased power consumption. Optical interconnection networks (OINs) have emerged to outperform the communication infrastructure in multiprocessor systems. Nevertheless, OINs’ full capability is curbed by high latency electrical controllers required to orchestrate and (re)configure the underlying photonic components, realizing a path between sending and receiving cores. Control techniques impose a high latency to perform the network routing, limiting the full utilization of OINs. In this paper, we design a novel low-latency Hybrid Controller (HyCo) that employs acceleration techniques to reduce its execution time. HyCo is developed based on integrating centralized and distributed control techniques as well as by using pre-calculated network routes and a Bloom filter, all of which result in a considerable reduction in HyCo’s latency. Simulation and prototyping results for networks up to 64×64 indicate a latency smaller than 50 ns, in the worst-case scenario.","PeriodicalId":229411,"journal":{"name":"2021 IEEE International Workshop on Rapid System Prototyping (RSP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129961704","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FPGA Prototyping of Systolic Array-based Accelerator for Low-Precision Inference of Deep Neural Networks","authors":"Soobeom Kim, Seunghwan Cho, Eunhyeok Park, S. Yoo","doi":"10.1109/rsp53691.2021.9806200","DOIUrl":"https://doi.org/10.1109/rsp53691.2021.9806200","url":null,"abstract":"In this study, we aim to design an energy-efficient computation system for deep neural networks on edge devices. To maximize energy efficiency, we design a novel hardware accelerator that supports low-precision computation and sparsity-aware structured zero-skipping on top of the well-known systolic-array structure. In addition, we introduce a full-stack software platform, including a model optimizer, instruction compiler, and host interface, to translate the pre-trained PyTorch model to the proposed accelerator and orchestrate it automatically. We validate the entire system by prototyping the accelerator on the Xilinx Alveo U250 FPGA board and demonstrating the inference of the 4-bit ResNet-50 model through the software stack. According to our experiment, our platform shows 317 GOPS inference speed and 51.96 GOPS/W energy efficiency for ResNet-50 on Xilinx Alveo U250 FPGA at 108 MHz, which is comparable to the advanced commercial acceleration system in terms of energy efficiency.","PeriodicalId":229411,"journal":{"name":"2021 IEEE International Workshop on Rapid System Prototyping (RSP)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133340956","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrating Quick Resource Estimators in Hardware Construction Framework for Design Space Exploration","authors":"Bruno Ferres, O. Muller, F. Rousseau","doi":"10.1109/rsp53691.2021.9806276","DOIUrl":"https://doi.org/10.1109/rsp53691.2021.9806276","url":null,"abstract":"Hardware design processes often come with time-consuming iteration loops, as feedback generally results from long synthesis runs. This is even more true when multiple different implementations need to be compared to perform Design Space Exploration (DSE). In order to accelerate such flows and increase the agility of developers, closing the gap with software development methodologies, we propose to use quick feedback-generating transforms based on RTL circuit analysis for quicker convergence of exploration. We also introduce a Hardware Construction Language (HCL) based methodology to build explorable circuit generators, and demonstrate such usage over a General Matrix Multiply (GEMM) Chisel implementation. We demonstrate that using RTL estimation early in the exploration process results in 7× fewer synthesis runs and 4.1× faster convergence than an exhaustive synthesis process, and still achieves state-of-the-art performance when targeting a Xilinx VC709 FPGA.","PeriodicalId":229411,"journal":{"name":"2021 IEEE International Workshop on Rapid System Prototyping (RSP)","volume":"70 5","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134392028","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prototyping FPGA through overlays","authors":"Théotime Bollengier, Loïc Lagadec, C. Teodorov","doi":"10.1109/RSP53691.2021.9806222","DOIUrl":"https://doi.org/10.1109/RSP53691.2021.9806222","url":null,"abstract":"eFPGAs give designers the flexibility to make changes at any point in the chip’s life span, even in the customers’ systems. However, eFPGAs are not efficient from an integration perspective, making proper dimensioning and tailoring mandatory. Unfortunately, designing an eFPGA is a complex and error-prone task. Even though automatic generation from high-level models can produce correct-by-construction layouts, integration remains complex due to process variation. A key point is then to reduce the technology dependency. This paper presents the ELNATH project in which three implementations of the same architecture have been addressed: overlay, eFPGA, and 55 nm FPGA thanks to an open-source integrated tool flow that supports defining, implementing and programming reconfigurable architectures.","PeriodicalId":229411,"journal":{"name":"2021 IEEE International Workshop on Rapid System Prototyping (RSP)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122939262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nils Büscher, Daniel Gis, Johann-Peter Wolff, C. Haubelt
{"title":"Data Augmentation Framework for Smart Sensor System Development Using the Sensor-in-the-Loop Prototyping Platform","authors":"Nils Büscher, Daniel Gis, Johann-Peter Wolff, C. Haubelt","doi":"10.1109/rsp53691.2021.9806209","DOIUrl":"https://doi.org/10.1109/rsp53691.2021.9806209","url":null,"abstract":"Sensor subsystems are becoming more complex as they increasingly take on various tasks, from signal processing to pattern recognition. This off-loading of continuous processes can decrease power consumption of the entire system, as the application processor of e.g. a smartwatch can spend more time in low-power modes. The design and validation of these sensor subsystems are thus becoming more difficult and time-consuming. This development process relies heavily on the availability of suitable sensor signals for testing. We propose a modular component-based framework to generate a multitude of tests using either prerecorded sensor signals or artificial signals to allow for a faster and more thorough development process and prototyping. Using the recently proposed Sensor-in-the-Loop architecture, the work at hand demonstrates not only the ability to test and develop the software offline in a simulation, but also to use the augmented sensor signal data directly on the hardware prototype at run-time. The framework can be used in various testing and development setups to simulate sensor characteristics, processing steps and errors. We show, based on three comprehensive examples, that the proposed framework is able to simulate the common errors found in inertial MEMS sensors, generate signal traces to develop, train, and evaluate gesture recognition algorithms, and simulate pre-processing steps in order to evaluate their feasibility before they are implemented in the sensor firmware.","PeriodicalId":229411,"journal":{"name":"2021 IEEE International Workshop on Rapid System Prototyping (RSP)","volume":"116 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124057716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Harpreet Kaur, Georgiy Krylov, S. A. Damghani, K. Kent
{"title":"Heterogeneous Logic Implementation for Adders in VTR","authors":"Harpreet Kaur, Georgiy Krylov, S. A. Damghani, K. Kent","doi":"10.1109/rsp53691.2021.9806205","DOIUrl":"https://doi.org/10.1109/rsp53691.2021.9806205","url":null,"abstract":"Verilog-to-Routing (VTR) is a Field-Programmable Gate Array (FPGA) Computer-Aided Design (CAD) tool. It is composed of three tools, namely ODIN II, ABC and VPR with each performing distinctive optimizations at different stages of the design flow. The elaboration and hard block synthesis stage of VTR is the core responsibility of the sub-project ODIN II. This work enables ODIN II to use fewer hard adders in the circuit by allowing soft logic implementation alongside hard logic for circuits featuring addition operations. This is particularly useful in scenarios where a sufficient number of hard blocks are not available. The results of applying our modifications to ODIN II as well as the entire VTR flow have been analysed. The results reveal the potential of current adder optimizations to achieve up to 17% performance gains in terms of critical path delays. Another effect of the optimization is the implications on the resulting device size. Some future prospects in this respect are also outlined in this paper.","PeriodicalId":229411,"journal":{"name":"2021 IEEE International Workshop on Rapid System Prototyping (RSP)","volume":"119 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128672430","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An FPGA-based Emulation Platform for Edge Computing Node Design Exploration","authors":"Theo Soriano, D. Novo, P. Benoit","doi":"10.1109/rsp53691.2021.9806230","DOIUrl":"https://doi.org/10.1109/rsp53691.2021.9806230","url":null,"abstract":"Recent advances in machine learning have made it possible to consider the implementation of smart applications in constrained systems at the edge of the network. These memory and Central Processing Unit (CPU) intensive applications may require specific exploration methodologies to design efficient node computing devices. To better guide and validate these explorations, we need to perform energy and performance evaluations of the system. Software-based evaluation tools are application-oriented and do not consider real-time and hardware constraints. Alternatively, hardware prototyping allows an accurate and real-time evaluation but offers limited flexibility and does not allow agile design exploration of the microcontroller unit (MCU). In this work, we propose a Field-Programmable Gate Array (FPGA)-based edge computing node emulation platform. Our solution combines the flexibility and the real-time capability of programmable logic with hardware prototype evaluation. We present an open-source microcontroller architecture for design exploration which integrates an activity monitor to collect traces at run-time. These activity traces are then used to profile the energy consumption of different components in the edge computing node. Importantly, our FPGA is connected to real sensors and communication modules to enable interactions with the environment during the node evaluation and exploration.","PeriodicalId":229411,"journal":{"name":"2021 IEEE International Workshop on Rapid System Prototyping (RSP)","volume":"189 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115600157","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
K. Neubauer, Leonard Masing, Michael Mahl, Jürgen Becker, Max Kramer, C. Reichmann
{"title":"Template-Driven and Hardware-Centric Cross-Domain E/E Architecture Simulation","authors":"K. Neubauer, Leonard Masing, Michael Mahl, Jürgen Becker, Max Kramer, C. Reichmann","doi":"10.1109/rsp53691.2021.9806231","DOIUrl":"https://doi.org/10.1109/rsp53691.2021.9806231","url":null,"abstract":"Due to various trends in the automotive sector, such as autonomous driving and electrification, the number of Electric/Electronic (E/E) components has risen in both hardware and software. This has led to an increase in certification requirements, which cannot be fulfilled without simulation anymore [1]. Different approaches have emerged trying to master this issue. However, for supporting early design decisions in the E/E development, these are either domain-specific or too elaborate. In this paper, we demonstrate an approach to realize early design decisions through a cross-domain simulation of E/E architectures, regarding the environment, scenarios, vehicle physics, the scheduling of software components and the power supply net. We use static E/E architecture hardware models, consisting of Electronic Control Units (ECUs), sensors, actuators and the wiring harness, as the base for the structure of our simulation models. The individual E/E components are linked to parameterizable simulation model templates to facilitate scalable execution. Moreover, scenarios are used for model reduction and supply the simulation model with stimuli. The simulation model is synthesized in an automated manner. For the evaluation, we simulate the power consumption of an electric vehicle, dependent on different loads. It shows that considering hardware aspects in early design phases uncovers errors that would have been noticed much later, e.g. when using virtual Hardware In the Loop (vHIL) methods. We also investigate the scalability of our approach. As the E/E architecture modeling tool, we use Vector PREEvision, and for the simulation, Mathworks Simulink.","PeriodicalId":229411,"journal":{"name":"2021 IEEE International Workshop on Rapid System Prototyping (RSP)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124488436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Instruction Set Design Methodology for In-Memory Computing through QEMU-based System Emulator","authors":"Kévin Mambu, H. Charles, J. Dumas, Maha Kooli","doi":"10.1109/rsp53691.2021.9806255","DOIUrl":"https://doi.org/10.1109/rsp53691.2021.9806255","url":null,"abstract":"In-Memory Computing (IMC) is a promising paradigm to mitigate the von Neumann bottleneck. However, its evaluation on complete applications in the context of full-scale systems is limited by the complexity of simulation frameworks as well as the disjunction between hardware exploration and compiler support. This paper proposes a global exploration flow at the scale of Instruction Set Architectures (ISA) that combines modeling with the generation of compiler support to perform ISA-level exploration. Our emulation methodology is based on QEMU, implements a performance model based on hardware characterizations from the State-of-the-Art, and allows the modeling of cache hierarchies, while our compiler support is automatically generated and based on a specialized compiler. We evaluate three applications in the domains of image processing and linear algebra on a reference IMC architecture, and analyze the obtained results to validate our methodology.","PeriodicalId":229411,"journal":{"name":"2021 IEEE International Workshop on Rapid System Prototyping (RSP)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121025039","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}