{"title":"Once For All Skip: Efficient Adaptive Deep Neural Networks","authors":"Yu Yang, Di Liu, Hui Fang, Yi-Xiong Huang, Ying Sun, Zhi-Yuan Zhang","doi":"10.23919/DATE54114.2022.9774567","DOIUrl":"https://doi.org/10.23919/DATE54114.2022.9774567","url":null,"abstract":"In this paper, we propose a new module, namely once for all skip (OFAS), for adaptive deep neural networks to efficiently control the block skip within a DNN model. The novelty of OFAS is that it only needs to compute once for all skippable blocks to determine their execution states. Moreover, since adaptive DNN models with OFAS cannot achieve the best accuracy and efficiency in end-to-end training, we propose a reinforcement learning-based training method to enhance the training procedure. The experimental results with different models and datasets demonstrate the effectiveness and efficiency in comparison to the state of the arts. The code is available at https://github.com/ieslab-ynu/OFAS.","PeriodicalId":232583,"journal":{"name":"2022 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133966378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Guaranteed Activation of Capacitive Trojan Triggers During Post Production Test via Supply Pulsing","authors":"Bora Bilgic, S. Ozev","doi":"10.23919/DATE54114.2022.9774705","DOIUrl":"https://doi.org/10.23919/DATE54114.2022.9774705","url":null,"abstract":"Involvement of many parties in the production of integrated circuits (ICs) makes the process more vulnerable to tampering. Consequently, IC security has become an important challenge to tackle. One of the threat models in hardware security domain is the insertion of unwanted and malicious hardware components, known as Hardware Trojans (HTs). A malicious attacker can insert a small modification into the functional circuit that can cause havoc in the field. To make the Trojan circuit stealthy, trigger circuits are typically used. The purpose of the trigger circuit is to hide the Trojan activity during post-production testing, and to randomize activation conditions, thereby making it very difficult to diagnose even after failures. Trigger mechanisms for Trojans typically delay and randomize the outcome based on a subset of internal digital signals. While there are many different ways of implementing the trigger mechanisms, charge based mechanisms have gained popularity due to their small size. In this paper, we propose a scheme to ensure that the trigger mechanisms are activated during production testing even if the conditions specified by the malicious attacker are not met. By disabling the mechanism that makes the Trojan stealthy, any of the parametric techniques can be used to detect Trojans at production time. The proposed technique relies on supply pulsing, where an increased potential difference between the gate and bulk of the active transistor in the output stage generates an alternate charge path for an otherwise unreachable capacitor and bypasses the input conditions to the trigger mechanism. SPICE simulations show that our method works well even for the smallest Trojan trigger mechanisms.","PeriodicalId":232583,"journal":{"name":"2022 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116807362","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DASC: A DRAM Data Mapping Methodology for Sparse Convolutional Neural Networks","authors":"B. Lai, Tzu-Chieh Chiang, Po-Shen Kuo, Wanqiu Wang, Yan-Lin Hung, Hung-Ming Chen, Chi Liu, S. Jou","doi":"10.23919/DATE54114.2022.9774608","DOIUrl":"https://doi.org/10.23919/DATE54114.2022.9774608","url":null,"abstract":"The data transferring of sheer model size of CNN (Convolution Neural Network) has become one of the main performance challenges in modern intelligent systems. Although pruning can trim down substantial amount of non-effective neurons, the excessive DRAM accesses of the non-zero data in a sparse network still dominate the overall system performance. Proper data mapping can enable efficient DRAM accesses for a CNN. However, previous DRAM mapping methods focus on dense CNN and become less effective when handling the compressed format and irregular accesses of sparse CNN. The extensive design space search for mapping parameters also results in a time-consuming process. This paper proposes DASC: a DRAM data mapping methodology for sparse CNNs. DASC is designed to handle the data access patterns and block schedule of sparse CNN to attain good spatial locality and efficient DRAM accesses. The bank-group feature in modern DDR is further exploited to enhance processing parallelism. DASC also introduces an analytical model to facilitate fast exploration and quick convergence of parameter search in minutes instead of days from previous work. When compared with the state-of-the-art, DASC decreases the total DRAM latencies and attains an average of 17.1x, 14.3x, and 23.3x better DRAM performance for sparse AlexNet, VGG-16, and ResNet-50 respectively.","PeriodicalId":232583,"journal":{"name":"2022 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126130766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Pluggable Vector Unit for RISC-V Vector Extension","authors":"V. Maisto, A. Cilardo","doi":"10.23919/DATE54114.2022.9774501","DOIUrl":"https://doi.org/10.23919/DATE54114.2022.9774501","url":null,"abstract":"Vector extensions have become increasingly important for accelerating data-parallel applications in areas like multimedia, data-streaming, and Machine Learning. This interactive presentation introduces a microarchitectural design of a vector unit compliant with the RISC-V vector extension v1.0. While we targeted a specific core for demonstration, CVA6, our architecture is designed so as to ensure extensibility, maintainability, and re-usability in other cores. Furthermore, as a distinctive feature, we support speculative execution and precise vector traps. The paper provides an overview of the main motivation, design choices, and implementation details, followed by a qualitative and quantitative discussion of the results collected from the synthesis of the extended CVA6 RISC-V core.","PeriodicalId":232583,"journal":{"name":"2022 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123782881","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SafeDM: a Hardware Diversity Monitor for Redundant Execution on Non-Lockstepped Cores","authors":"F. Bas, Pedro Benedicte, S. Alcaide, Guillem Cabo, Fabio Mazzocchetti, J. Abella","doi":"10.23919/DATE54114.2022.9774540","DOIUrl":"https://doi.org/10.23919/DATE54114.2022.9774540","url":null,"abstract":"Computing systems in the safety domain, such as those in avionics or space, require specific safety measures related to the criticality of the deployment. A problem these systems face is that of transient failures in hardware. A solution commonly used to tackle potential failures is to introduce redundancy in these systems, for example 2 cores that execute the same program at the same time. However, redundancy does not solve all potential failures, such as Common Cause Failures (CCF), where a single fault affects both cores identically (e.g. a voltage droop). If both redundant cores have identical state when the fault occurs, then there may be a CCF since the fault can affect both cores in the same way. To avoid CCF it is critical to know that there is diversity in the execution amongst the redundant cores. In this paper we introduce SafeDM, a hardware Diversity Monitor that quantifies the diversity of each redundant processor to guarantee that CCF will not go unnoticed, and without needing to deploy lockstepped cores. SafeDM computes data and instruction diversity separately, using different techniques appropriate for each case. We integrate SafeDM in a RISC-V FPGA space MPSoC from Cobham Gaisler where SafeDM is proven effective with a large benchmark suite, incurring low area and power overheads. Overall, SafeDM is an effective hardware solution to quantify diversity in cores performing redundant execution.","PeriodicalId":232583,"journal":{"name":"2022 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122200221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"OPACT: Optimization of Approximate Compressor Tree for Approximate Multiplier","authors":"Weihua Xiao, Cheng Zhuo, Weikang Qian","doi":"10.23919/DATE54114.2022.9774628","DOIUrl":"https://doi.org/10.23919/DATE54114.2022.9774628","url":null,"abstract":"Approximate multipliers have attracted significant attention of researchers for designing low-power systems. The most area-consuming part of a multiplier is its compressor tree (CT). Hence, the prior works proposed various approximate compressors to reduce the area of the CT. However, the compression strategy for the approximate compressors has not been systematically studied: Most of the prior works apply their ad hoc strategies to arrange approximate compressors. In this work, we propose OPACT, a method for optimizing approximate compressor tree for approximate multiplier. An integer linear programming problem is first formulated to co-optimize CT's area and error. Moreover, since different connection orders of the approximate compressors can affect the error of an approximate multiplier, we formulate another mixed-integer programming problem for optimizing the connection order. The experimental results showed that OPACT can produce approximate multipliers with an average reduction of 24.4% and 8.4% in power-delay product and mean error distance, respectively, compared to the best existing designs with the same types of approximate compressors used.","PeriodicalId":232583,"journal":{"name":"2022 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125143342","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BMC+Fuzz: Efficient and Effective Test Generation","authors":"Ravindra Metta, Raveendra Kumar Medicherla, S. Chakraborty","doi":"10.23919/DATE54114.2022.9774672","DOIUrl":"https://doi.org/10.23919/DATE54114.2022.9774672","url":null,"abstract":"Coverage Guided Fuzzing (CGF) is a greybox test generation technique. Bounded Model Checking (BMC) is a whitebox test generation technique. Both these have been highly successful at program coverage as well as error detection. It is well known that CGF fails to cover complex conditionals and deeply nested program points. BMC, on the other hand, fails to scale for programming features such as large loops and arrays. To alleviate the above problems, we propose (1) to combine BMC and CGF by using BMC for a short and potentially incomplete unwinding of a given program to generate effective initial test prefixes, which are then extended into complete test inputs for CGF to fuzz, and (2) in case BMC gets stuck even for the short unwinding, we automatically identify the reason, and rerun BMC with a corresponding remedial strategy. We call this approach as BMCFuzz and implemented it in the VeriFuzz framework. This implementation was experimentally evaluated by participating in Test-Comp 2021 and the results show that BMCFuzz is both effective and efficient at covering branches as well as exposing errors. In this paper, we present the details of BMCFuzz and our analysis of the experimental results.","PeriodicalId":232583,"journal":{"name":"2022 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127085971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Counteract Side-Channel Analysis of Neural Networks by Shuffling","authors":"Manuel Brosch, Matthias Probst, G. Sigl","doi":"10.23919/DATE54114.2022.9774710","DOIUrl":"https://doi.org/10.23919/DATE54114.2022.9774710","url":null,"abstract":"Machine learning is becoming an essential part in almost every electronic device. Implementations of neural networks are mostly targeted towards computational performance or memory footprint. Nevertheless, security is also an important part in order to keep the network secret and protect the intellectual property associated to the network. Especially, since neural network implementations are demonstrated to be vulnerable to side-channel analysis, powerful and computational cheap countermeasures are in demand. In this work, we apply a shuffling countermeasure to a microcontroller implementation of a neural network to prevent side-channel analysis. The countermeasure is effective while the computational overhead is low. We investigate the extensions necessary for our countermeasure, and how shuffling increases the effort for an attack in theory. In addition, we demonstrate the increase in effort for an attacker through experiments on real side-channel measurements. Based on the mechanism of shuffling and our experimental results, we conclude that an attack on a commonly used neural network with shuffling is no longer feasible in a reasonable amount of time.","PeriodicalId":232583,"journal":{"name":"2022 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128830615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"De-RISC: A Complete RISC-V Based Space-Grade Platform","authors":"N. Wessman, F. Malatesta, Stefano Ribes, J. Andersson, Antonio García-Vilanova, M. Masmano, Vicente Nicolau, Paco Gomez, Jimmy Le Rhun, S. Alcaide, Guillem Cabo, F. Bas, Pedro Benedicte, Fabio Mazzocchetti, J. Abella","doi":"10.23919/DATE54114.2022.9774557","DOIUrl":"https://doi.org/10.23919/DATE54114.2022.9774557","url":null,"abstract":"The H2020 EIC-FTI De-RISC project develops a RISC-V space-grade platform to jointly respond to several emerging, as well as longstanding needs in the space domain such as: (1) higher performance than that of monocore and basic multicore space-grade processors in the market; (2) access to an increasingly rich software ecosystem rather than sticking to the slowly fading SPARC and PowerPC-based ones; (3) freedom (or drastic reduction) of export and license restrictions imposed by commercial ISAs such as Arm; and (4) improved support for the design and validation of safety-related real-time applications, (5) being the platform with software qualified and hardware designed per established space industry standards. De-RISC partners have set up the different layers of the platform during the first phases of the project. However, they have recently boosted integration and assessment activities. This paper introduces the De-RISC space platform, presents recent progress such as enabling virtualization and software qualification, new MPSoC features, and use case deployment and evaluation, including a comparison against other commercial platforms. Finally, this paper introduces the ongoing activities that will lead to the hardware and fully qualified software platform at TRL8 on FPGA by September 2022.","PeriodicalId":232583,"journal":{"name":"2022 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129099057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mixed-Cell-Height Legalization on CPU-GPU Heterogeneous Systems","authors":"Haoyu Yang, Kit Fung, Yuxuan Zhao, Yibo Lin, Bei Yu","doi":"10.23919/DATE54114.2022.9774671","DOIUrl":"https://doi.org/10.23919/DATE54114.2022.9774671","url":null,"abstract":"Legalization conducts refinements on post-global-placement cell location to compromise design constraints and parameters. These include placement fence regions, power/ground rail alignments, timing, wire length, etc. In advanced technology nodes, designs can easily contain millions of multiple-row standard cells, which challenges the scalability of modern legalization algorithms. In this paper, for the first time, we investigate dedicated legalization algorithms on heterogeneous platforms, which promises intelligent usage of CPU and GPU resources and hence provides new algorithm design methodologies for large scale physical design problems. Experimental results on ICCAD 2017 and ISPD 2015 contest benchmarks demonstrate the effectiveness and the efficiency of the proposed algorithm, compared to the state-of-the-art legalization solution for mixed-cell-height designs.","PeriodicalId":232583,"journal":{"name":"2022 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127835509","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}