Christoph Gerum, Adrian Frischknecht, T. Hald, Paul Palomero Bernardo, Konstantin Lübeck, O. Bringmann
{"title":"Hardware Accelerator and Neural Network Co-Optimization for Ultra-Low-Power Audio Processing Devices","authors":"Christoph Gerum, Adrian Frischknecht, T. Hald, Paul Palomero Bernardo, Konstantin Lübeck, O. Bringmann","doi":"10.1109/DSD57027.2022.00056","DOIUrl":"https://doi.org/10.1109/DSD57027.2022.00056","url":null,"abstract":"The increasing spread of artificial neural networks does not stop at ultralow-power edge devices. However, these very often have high computational demand and require specialized hardware accelerators to ensure the design meets power and performance constraints. The manual optimization of neural networks along with the corresponding hardware accelerators can be very challenging. This paper presents HANNAH (Hardware Accelerator and Neural Network seArcH), a framework for automated and combined hardware/software co-design of deep neural networks and hardware accelerators for resource and power-constrained edge devices. The optimization approach uses an evolution-based search algorithm, a neural network template technique and analytical KPI models for the configurable UltraTrail hardware accelerator template in order to find an optimized neural network and accelerator configuration. 
We demonstrate that HANNAH can find suitable neural networks with minimized power consumption and high accuracy for different audio classification tasks such as single-class wake word detection, multi-class keyword detection and voice activity detection, which are superior to the related work.","PeriodicalId":211723,"journal":{"name":"2022 25th Euromicro Conference on Digital System Design (DSD)","volume":"368 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133335504","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Andrea Galimberti, D. Galli, Gabriele Montanaro, W. Fornaciari, Davide Zoni
{"title":"FPGA implementation of BIKE for quantum-resistant TLS","authors":"Andrea Galimberti, D. Galli, Gabriele Montanaro, W. Fornaciari, Davide Zoni","doi":"10.1109/DSD57027.2022.00078","DOIUrl":"https://doi.org/10.1109/DSD57027.2022.00078","url":null,"abstract":"The recent advances in quantum computers impose the adoption of post-quantum cryptosystems into secure communication protocols. This work proposes two FPGA-based, client- and server-side hardware architectures to support the integration of the BIKE post-quantum KEM within TLS. Thanks to the parametric hardware design, the paper explores the best option between hardware and software implementations, given a set of available hardware resources and a realistic use-case scenario. The experimental evaluation comparing our client and server designs against the reference AVX2 and hardware implementations of BIKE highlighted two aspects. First, the proposed client and server architectures outperform the reference hardware implementation of BIKE by eight and four times, respectively. Second, the performance comparison between our client and server designs against the reference AVX2 implementation strongly depends on the available resource. 
Our solution is almost twice as fast as the AVX2 implementation while implemented on the Artix-7 200 FPGA, while it is up to six times slower when targeting smaller FPGAs, thus motivating a careful analysis of the available hardware resources and the optimization of the design's parallelism before opting for hardware support.","PeriodicalId":211723,"journal":{"name":"2022 25th Euromicro Conference on Digital System Design (DSD)","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114761135","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Halima Bouzidi, Hamza Ouarnoughi, S. Niar, E. Talbi, Abdessamad Ait El Cadi
{"title":"Co-Optimization of DNN and Hardware Configurations on Edge GPUs","authors":"Halima Bouzidi, Hamza Ouarnoughi, S. Niar, E. Talbi, Abdessamad Ait El Cadi","doi":"10.1109/DSD57027.2022.00060","DOIUrl":"https://doi.org/10.1109/DSD57027.2022.00060","url":null,"abstract":"The ever-increasing complexity of both Deep Neural Networks (DNN) and hardware accelerators has made the co-optimization of these domains extremely complex. Previous works typically focus on optimizing DNNs given a fixed hardware configuration or optimizing a specific hardware architecture given a fixed DNN model. Recently, the importance of the joint exploration of the two spaces drew more and more attention. Our work targets the co-optimization of DNN and hardware configurations on edge GPU accelerators. We propose an evolutionary-based co-optimization strategy by considering three metrics: DNN accuracy, execution latency, and power consumption. By combining the two search spaces, a larger number of configurations can be explored in a short time interval. In addition, a better tradeoff between DNN accuracy and hardware efficiency can be obtained. Experimental results show that the co-optimization outperforms the optimization of DNN for fixed hardware configuration with up to 53% hardware efficiency gains with the same accuracy and inference time.","PeriodicalId":211723,"journal":{"name":"2022 25th Euromicro Conference on Digital System Design (DSD)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128601158","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Hybrid Scheduling Mechanism for Multi-programming in Mixed-Criticality Systems","authors":"Mohammad Bawatna, Behnaz Ranjbar, Akash Kumar","doi":"10.1109/DSD57027.2022.00033","DOIUrl":"https://doi.org/10.1109/DSD57027.2022.00033","url":null,"abstract":"In the last decade, the rapid evolution of the Commercial-Off-The-Shelf (COTS) platforms led safety-critical systems towards integrating tasks and applications with different criticality levels in a shared hardware platform, i.e., Mixed-Criticality Systems (MCSs). Therefore, several scheduling algorithms and approaches have been proposed upon a commonly used model, i.e., Vestal's model. However, consolidating software functions onto shared processors cannot be implemented directly in real-life applications and industrial systems while complying with certification requirements. The existing scheduling approaches do not provide a simple solution for eliminating the interference effect among the tasks with different criticality levels on the shared processing resources. Moreover, the system mode switch guarantees the timing constraints of the high-criticality tasks through the termination of the low-criticality tasks. In this paper, we developed a new scheduling algorithm that addresses these challenges based on the round-robin technique, which improves the overall schedulability. We compared the proposed algorithm against existing scheduling algorithms in both academia and industry using extensive experiments to evaluate it. 
Our results show improvements in the schedulability from 0.8% to 14.0% and from 2.7% to 10.7% compared to the conventional Earliest Deadline First with Virtual Deadline (EDF-VD) and Fixed Priority Preemptive (FPP) scheduling approaches, respectively.","PeriodicalId":211723,"journal":{"name":"2022 25th Euromicro Conference on Digital System Design (DSD)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133955307","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mohammad Reza Heidari Iman, J. Raik, G. Jervan, Tara Ghasempouri
{"title":"IMMizer: An Innovative Cost-Effective Method for Minimizing Assertion Sets","authors":"Mohammad Reza Heidari Iman, J. Raik, G. Jervan, Tara Ghasempouri","doi":"10.1109/DSD57027.2022.00095","DOIUrl":"https://doi.org/10.1109/DSD57027.2022.00095","url":null,"abstract":"Assertion-based verification is one of the viable solutions for the verification of computer systems. Assertions can be automatically generated by assertion miners; however, these miners typically generate a high number of possibly redundant assertions. In turn, this results in higher costs and overheads in the verification process. Furthermore, these assertions often have low readability due to the high number of propositions that they contain. In this paper, an Innovative cost-effective Method for Minimizing assertion sets (IMMizer) has been proposed. IMMizer is performed by identifying Contradictory Terms. These terms present the behaviors of the design under verification which are not specified by the initial assertion sets. Subsequently, a new assertion set is extracted based on the identified Contradictory Terms. Contrary to data-mining approaches that are unable to minimize the initial assertion set, but can only rank the set according to data-mining measurements, or mutant analysis approaches that require a long execution time, IMMizer is able to minimize the initial assertion set in a very short execution time. 
Experimental results showed that in the best case, this method has drastically reduced the number of assertions by 93% and the memory overhead imposed on the system by 87%, without any reduction in the detection of injected mutants.","PeriodicalId":211723,"journal":{"name":"2022 25th Euromicro Conference on Digital System Design (DSD)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133057534","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Liliana Granados-Castro, O. Gutiérrez-Navarro, I. A. Cruz-Guerrero, Juan N. Mendoza-Chavarría, Eric R. Zavala-Sánchez, D. U. Campos‐Delgado
{"title":"Estimation of deoxygenated and oxygenated hemoglobin by multispectral blind linear unmixing","authors":"Liliana Granados-Castro, O. Gutiérrez-Navarro, I. A. Cruz-Guerrero, Juan N. Mendoza-Chavarría, Eric R. Zavala-Sánchez, D. U. Campos‐Delgado","doi":"10.1109/DSD57027.2022.00119","DOIUrl":"https://doi.org/10.1109/DSD57027.2022.00119","url":null,"abstract":"Blood perfusion parameters can be used to evaluate the micro-circulatory health condition of a patient. Several non-invasive optical techniques have been used to estimate blood perfusion, such as near-infrared spectroscopy or pulse-oximetry. However, these techniques require contact with the patient, and the measurements are restricted to a single point evaluation. These disadvantages could be solved by multispectral imaging. Hence, this paper presents an approach based on multispectral imaging and blind linear unmixing, as an alternative to estimate blood perfusion parameters in the hand palm. This work evaluated changes in oxygenated and deoxygenated hemoglobin concentrations by employing an experimental occlusion protocol in healthy volunteers. We compared the results of several blind linear unmixing and linear regression models. The average cosine similarity values between the prediction model and the photoplethysmography estimations ranged from 87% to 96%. The mean R-squared adjusted values for oxygenated and deoxygenated hemoglobin were greater than or equal to 0.75 and 0.84, respectively. 
Our results demonstrated the feasibility of non-invasive estimation of hemoglobin in the hand palm, and opened the possibility for calculating other perfusion parameters that help to diagnose and monitor pathologies in large tissue regions.","PeriodicalId":211723,"journal":{"name":"2022 25th Euromicro Conference on Digital System Design (DSD)","volume":"23 14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125455635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Co-Optimizing Sensing and Deep Machine Learning in Automotive Cyber-Physical Systems","authors":"Joydeep Dey, S. Pasricha","doi":"10.1109/DSD57027.2022.00049","DOIUrl":"https://doi.org/10.1109/DSD57027.2022.00049","url":null,"abstract":"Accurate perception of the environment is critical to achieving safety and performance goals in emerging semi-autonomous vehicles. Building a perception architecture to support autonomy goals in vehicles requires solving many complex problems related to sensor selection and placement, sensor fusion, and machine learning-driven object detection. In this paper, we present a framework for co-optimizing sensing and machine learning to meet autonomy goals in emerging automotive cyber-physical systems. Experimental results that target level 2 autonomy goals for the Audi-TT and BMW-Minicooper vehicles demonstrate how our framework can intelligently traverse the massive design space to find robust, vehicle-specific perception architecture solutions.","PeriodicalId":211723,"journal":{"name":"2022 25th Euromicro Conference on Digital System Design (DSD)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126012640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Breaking (and Fixing) Channel-based Cryptographic Key Generation: A Machine Learning Approach","authors":"Ihsen Alouani","doi":"10.1109/DSD57027.2022.00058","DOIUrl":"https://doi.org/10.1109/DSD57027.2022.00058","url":null,"abstract":"Several systems and application domains are undergoing disruptive transformations due to the recent breakthroughs in computing paradigms such as Machine Learning and communication technologies such as 5G and beyond. Intelligent transportation systems is one of the flagship domains that witnessed drastic transformations through the development of ML-based environment perception along with Vehicle-to-Vehicle (V2V) and Vehicle-to-Infrastructure (V2I) communication protocols. Such connected, intelligent and collaborative transportation systems represent a promising trend towards smart roads and cities. However, the safety-critical aspect of these cyber-physical systems requires a systematic study of their security and privacy. In fact, security-sensitive information could be transmitted between vehicles, or between vehicles and the infrastructure such as security alerts, payment, etc. Since asymmetric cryptography is heavy to implement on embedded time-critical devices, in addition to the complexity of PKI-based solutions, symmetric cryptography offers confidentiality along with high performance. However, cryptographic key generation and establishment in symmetric cryptosystems is a great challenge. Recent work proposed a key generation and establishment protocol for vehicular communication that is based on the reciprocity and high spatial and temporal variation properties of the vehicular communication channel. This paper investigates the limitations of such channel-based key generation protocols. 
Based on a channel model with a machine learning approach, we show the possibility for a passive eavesdropper to compromise the secret key in a practical manner, thereby undermining the security of such key establishment technique. Moreover, we propose a defense based on adversarial machine learning to overcome this limit.","PeriodicalId":211723,"journal":{"name":"2022 25th Euromicro Conference on Digital System Design (DSD)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124159716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abdul Khader Thalakkattu Moosa, Nilotpola Sarma, C. Karfa
{"title":"ImageSpec: Efficient High-Level Synthesis of Image Processing Applications","authors":"Abdul Khader Thalakkattu Moosa, Nilotpola Sarma, C. Karfa","doi":"10.1109/DSD57027.2022.00019","DOIUrl":"https://doi.org/10.1109/DSD57027.2022.00019","url":null,"abstract":"The necessity of efficient hardware accelerators for image processing kernels is a well known problem. Unlike the conventional HDL based design process, High-level Synthesis (HLS) can directly convert behavioral (C/C++) description into RTL code and can reduce design complexity, design time as well as provide user opportunity for design space exploration. Due to the vast optimization possibilities in HLS, a proper application level behavioral characterization is necessary to understand the leverages offered by these workloads especially for facilitating parallel computation. In this work, we present a set of HLS optimization strategies derived upon exploiting the most general HLS influential characteristic features of image processing algorithms. We also present an HLS benchmark suite ImageSpec to demonstrate our strategies and their efficiency in optimizing workloads spanning diverse domains within image processing sector. We have shown that an average performance to hardware gain of 143x could be achieved over the baseline implementation using our optimization strategies.","PeriodicalId":211723,"journal":{"name":"2022 25th Euromicro Conference on Digital System Design (DSD)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127895544","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
R. Nouacer, Raphaël Lallement, Rodrigo Castiñeira, Jean-Frédéric Real, J. Mascomère
{"title":"COMP4DRONES: Key Enabling Technologies for Drones to enhance Mobility and Logistics Operations","authors":"R. Nouacer, Raphaël Lallement, Rodrigo Castiñeira, Jean-Frédéric Real, J. Mascomère","doi":"10.1109/DSD57027.2022.00103","DOIUrl":"https://doi.org/10.1109/DSD57027.2022.00103","url":null,"abstract":"This paper presents the achievements of the COMP4DRONES project [1]. It aims to raise awareness of the potential for future mobility and logistics applications by integrating drones in the Intelligent Transport Systems. This paper presents the outcomes of this European project that has developed key technologies to deploy innovative drone-based services. It presents the results of two use cases where different transport and mobility stakeholders have included drones in their operations to validate the COMP4DRONES framework as well as the key technologies to enable the use of drones in the mobility and the transport sector.","PeriodicalId":211723,"journal":{"name":"2022 25th Euromicro Conference on Digital System Design (DSD)","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131461455","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}