{"title":"Error Resilient Transformers: A Novel Soft Error Vulnerability Guided Approach to Error Checking and Suppression","authors":"Kwondo Ma, C. Amarnath, A. Chatterjee","doi":"10.1109/ETS56758.2023.10174239","DOIUrl":"https://doi.org/10.1109/ETS56758.2023.10174239","url":null,"abstract":"Transformer networks have achieved remarkable success in Natural Language Processing (NLP) and Computer Vision applications. However, the underlying large volumes of Transformer computations demand high reliability and resilience to soft errors in processor hardware. The objective of this research is to develop efficient techniques for design of error resilient Transformer architectures. To enable this, we first perform a soft error vulnerability analysis of every fully connected layer in Transformer computations. Based on this study, error detection and suppression modules are selectively introduced into datapaths to restore Transformer performance under anticipated error rate conditions. Memory access errors and neuron output errors are detected using checksums of linear Transformer computations. Correction consists of determining output neurons with out-of-range values and suppressing the same to zero. For a Transformer with nominal BLEU score of 52.7, such vulnerability guided selective error suppression can recover language translation performance from a BLEU score of 0 to 50.774 with as much as 0.001 probability of activation error, incurring negligible memory and computation overheads.","PeriodicalId":211522,"journal":{"name":"2023 IEEE European Test Symposium (ETS)","volume":"23 Vol. 23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127670763","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learn to Tune: Robust Performance Tuning in Post-Silicon Validation","authors":"P. Domanski, D. Pflüger, Raphael Latty","doi":"10.1109/ETS56758.2023.10174123","DOIUrl":"https://doi.org/10.1109/ETS56758.2023.10174123","url":null,"abstract":"Post-silicon validation is a crucial yet challenging problem primarily due to the increasing complexity of the semi-conductor value chain. Existing techniques cannot keep up with the rapid increase in the complexity of designs. Therefore, post-silicon validation is becoming an expensive bottleneck. Robust performance tuning is relevant to compensate impacts of process variations and non-ideal design implementations. We propose a novel approach based on Deep Reinforcement Learning and Learn to Optimize. The method automatically learns flexible tuning strategies tailored to specific circuits. Additionally, it addresses high-dimensional tuning tasks, including mixed data types and dependencies, e.g., on operating conditions. In this work, we introduce Learn to Tune and demonstrate its appealing properties in post-silicon validation, e.g., lower computational cost or faster time-to-optimize, allowing a more efficient adaption of the tuning to changing tuning conditions than classical methods.","PeriodicalId":211522,"journal":{"name":"2023 IEEE European Test Symposium (ETS)","volume":"146 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114888223","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Understanding Permanent Hardware Failures in Deep Learning Training Accelerator Systems","authors":"Yi He, Yanjing Li","doi":"10.1109/ETS56758.2023.10173972","DOIUrl":"https://doi.org/10.1109/ETS56758.2023.10173972","url":null,"abstract":"Hardware failures pose critical threats to deep neural network (DNN) training workloads, and the urgency of tackling this challenge (known as the Silent Data Corruption challenge in a broader context) has been raised widely by the industry. Based on industry reports, a large number of the failures observed in real systems are permanent hardware failures in logic. However, there is a very limited understanding of the effects that these failures can impose on DNN training workloads. In this paper, we present the first resilience study on this subject, focusing on deep learning (DL) training accelerator systems. We developed a fault injection framework to accurately simulate the effects of permanent faults, and conducted 100K fault injection experiments. Our results provide the fundamental understanding on how logic permanent hardware failures affect training workloads and eventually generate unexpected training outcomes. Based on this new knowledge, we developed efficient software-based detection and recovery techniques to mitigate logic permanent hardware failures that are likely to generate unexpected outcomes. Evaluation on Google Cloud TPUs shows that our techniques are effective and practical: they require 15−25 lines of code change, and introduce 0.004%−0.025% performance/energy overhead for various representative neural network models.","PeriodicalId":211522,"journal":{"name":"2023 IEEE European Test Symposium (ETS)","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117249086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On Using Cell-Aware Methodology for SRAM Bit Cell Testing","authors":"X. Xhafa, A. Ladhar, E. Faehn, L. Anghel, G. D. Pendina, P. Girard, A. Virazel","doi":"10.1109/ETS56758.2023.10174118","DOIUrl":"https://doi.org/10.1109/ETS56758.2023.10174118","url":null,"abstract":"The shrinking of technology nodes has led to high density memories containing large amounts of transistors which are prone to defects and reliability issues. Their test is generally based on the use of well-known March algorithms targeting Functional Fault Models (FFMs). This paper presents a novel approach for memory testing which relies on Cell-Aware (CA) methodology to further improve the yield of System on Chips (SoCs). Consequently, using CA methodology converts memory testing from functional to structural testing. In this work, the preliminary flow of the CA-based memory testing methodology is presented. The generation of the CA model for the SRAM bit cell has been demonstrated as a case study. The generated CA model and the structural representation of the memory are used by the ATPG to test the bit cell in the presence of short and open defects. The generated test patterns are able to detect both static and dynamic faults in the bit cell with a test coverage of 100%.","PeriodicalId":211522,"journal":{"name":"2023 IEEE European Test Symposium (ETS)","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128137241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BiSTAHL: A Built-In Self-Testable Soft-Error-Hardened Scan-Cell","authors":"S. Holst, Ruijun Ma, X. Wen, Aibin Yan, Hui Xu","doi":"10.1109/ETS56758.2023.10174154","DOIUrl":"https://doi.org/10.1109/ETS56758.2023.10174154","url":null,"abstract":"Ensuring the correct operation of modern VLSI circuits within safety-critical systems is essential since modern technology nodes are more susceptible to Early-Life Failures (ELFs) and radiation-induced Soft-Errors (SEs). Tackling both of these challenges leads to contradicting design requirements: Effective in-field ELF detection requires online-monitoring or periodic built-in self-testing with excellent cell-internal defect coverage. SE-hardened latch designs, however, are less testable because they are designed to mask cell-internal failures. We propose BiSTAHL, a new SE-hardened scan-cell design that is fully built-in self-testable for both production defects and ELFs.","PeriodicalId":211522,"journal":{"name":"2023 IEEE European Test Symposium (ETS)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128616768","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dependability of Future Edge-AI Processors: Pandora’s Box","authors":"M. Gomony, A. Gebregiorgis, M. Fieback, M. Geilen, S. Stuijk, Jan Richter-Brockmann, R. Bishnoi, Sven Argo, Lara Arche Andradas, T. Güneysu, M. Taouil, H. Corporaal, S. Hamdioui","doi":"10.1109/ETS56758.2023.10174180","DOIUrl":"https://doi.org/10.1109/ETS56758.2023.10174180","url":null,"abstract":"This paper addresses one of the directions of the HORIZON EU CONVOLVE project, namely the dependability of smart edge processors based on computation-in-memory and emerging memristor devices such as RRAM. It discusses how this alternative computing paradigm will change the way we used to do manufacturing test. In addition, it describes how these emerging devices, which inherently suffer from many non-idealities, call for new solutions in order to ensure accurate and reliable edge computing. Moreover, the paper also covers the security aspects for future edge processors and shows the challenges and the future directions.","PeriodicalId":211522,"journal":{"name":"2023 IEEE European Test Symposium (ETS)","volume":"206 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123977471","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Synthesis of IJTAG Networks for Multi-Power Domain Systems on Chips","authors":"P. Habiby, N. Lylina, Chih-Hao Wang, H. Wunderlich, S. Huhn, R. Drechsler","doi":"10.1109/ETS56758.2023.10174127","DOIUrl":"https://doi.org/10.1109/ETS56758.2023.10174127","url":null,"abstract":"The high-volume manufacturing test ensures the production of defect-free devices, which is of utmost importance when dealing with safety-critical systems. Such a high-quality test requires a deliberately designed scan network to provide a time and cost-effective access to many on-chip components, as included in state-of-the-art chip designs. The IEEE 1687 Std. (IJTAG) has been introduced to tackle this challenge by adding programmable components that enable the design of reconfigurable scan networks. Although these networks reduce the test time by shortening the scan chains’ lengths, the reconfiguration process itself incurs an additional time overhead. This paper proposes a heuristic method for designing customized multi-power domain reconfigurable scan networks with a minimized overall reconfiguration time. More precisely, the proposed method exploits a-priori given non-functional properties of the system, such as the power characteristics and the instruments’ access requirements. For the first time, these non-functional properties are considered to synthesize a well-adjusted and highly efficient multi-power domain network. The experimental results show a considerable improvement over the reported benchmark networks.","PeriodicalId":211522,"journal":{"name":"2023 IEEE European Test Symposium (ETS)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121833958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Secrets Leaking Through Quicksand: Covert Channels in Approximate Computing","authors":"Lorenzo Masciullo, R. Passerone, F. Regazzoni, I. Polian","doi":"10.1109/ETS56758.2023.10174181","DOIUrl":"https://doi.org/10.1109/ETS56758.2023.10174181","url":null,"abstract":"Approximate computing (AxC) has emerged as an attractive architectural paradigm especially for artificial-intelligence applications, yet its security implications are being neglected. We demonstrate a novel covert channel where the malicious sender modulates transmission by switching between regular and AxC realizations of the same computational task. The malicious receiver identifies the transmitted information by either reading out the workload statistics or by creating controlled congestion. We demonstrate the channel on both an Android simulator and an actual smartphone and systematically study measures to increase its robustness. The achievable transmission rates are comparable with earlier covert channels based on power consumption, but the malicious behavior of our channel is more stealthy and less detectable.","PeriodicalId":211522,"journal":{"name":"2023 IEEE European Test Symposium (ETS)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124673550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Increasing SAT-Resilience of Logic Locking Mechanisms using Formal Methods","authors":"M. Merten, S. Huhn, R. Drechsler","doi":"10.1109/ETS56758.2023.10173975","DOIUrl":"https://doi.org/10.1109/ETS56758.2023.10173975","url":null,"abstract":"Today, Integrated Circuits (ICs) manufacturing is distributed over various foundries, resulting in untrustworthy supply chains. Therefore, significant concerns about malicious intentions like intellectual property piracy of the fabricated ICs exist. Logic Locking (LL) is one well-known protection technique to improve the security of ICs. However, there are approaches to unlocking the circuit, like the SAT-based attack. Significant research has been done on thwarting the SAT-based attack by providing SAT-resilient LL. Nevertheless, these SAT-resilient LL approaches have an inherent structural footprint, yielding a high vulnerability to structural attacks. Recently, Polymorphic Logic Gates (PLGs) have been utilized to implement logic obfuscation by replacing gates. Reconfigurable Field Effect Transistors (RFETs) are a new emerging technology for implementing such PLGs due to their inherent camouflaging properties. This work proposes a novel technique for increasing SAT-resilience while introducing no structural weakness using those PLGs. In particular, based on the concept of an SAT-based attack, a procedure for determining the most SAT-resilient placement of LL-cells is developed. The experimental evaluation proves that the proposed hardening of the placement increases the SAT-resilience compared to a random placement while providing inherent camouflaging of RFET-cells.","PeriodicalId":211522,"journal":{"name":"2023 IEEE European Test Symposium (ETS)","volume":"64 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131960854","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Attacking Memristor-Mapped Graph Neural Network by Inducing Slow-to-Write Errors","authors":"Ching-Yuan Chen, Biresh Kumar Joardar, J. Doppa, P. Pande, K. Chakrabarty","doi":"10.1109/ETS56758.2023.10174062","DOIUrl":"https://doi.org/10.1109/ETS56758.2023.10174062","url":null,"abstract":"Graph neural networks (GNNs) are becoming popular in various real-world applications. However, hardware-level security is a concern when GNN models are mapped to emerging neuromorphic technologies such as memristor-based crossbars. These security issues can lead to malfunction of memristor-mapped GNNs. We identify a vulnerability of memristor-mapped GNNs and propose an attack mechanism based on the identified vulnerability. The proposed attack tampers memristor-mapped graph-structured data of a GNN by injecting adversarial edges to the graph and inducing slow-to-write errors in crossbars. We show that 10% adversarial edge injection induces 1.11× longer write latency, eventually leading to a 44.33% error in node classification. Experimental results for the proposed attack also show that there is a 5.72× increase in the success rate compared to a software-based baseline.","PeriodicalId":211522,"journal":{"name":"2023 IEEE European Test Symposium (ETS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133602417","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}