{"title":"Detecting intermittent resistive faults in digital CMOS circuits","authors":"Hassan Ebrahimi, Alireza Rohani, H. Kerkhoff","doi":"10.1109/DFT.2016.7684075","DOIUrl":"https://doi.org/10.1109/DFT.2016.7684075","url":null,"abstract":"Interconnection reliability issues threaten the dependability of highly critical electronic systems. Among the most challenging interconnection-induced reliability threats are intermittent resistive faults (IRFs). The interval between occurrences of this kind of defect can be as long as, e.g., one month, and the duration of a defect can be as short as a few nanoseconds. As a result, evoking and detecting these faults is a major challenge. IRFs can cause timing deviations in the data paths of digital systems during their operating time. This paper proposes an online digital slack monitor that is able to detect small timing deviations caused by IRFs in digital systems. The simulation results show that the proposed monitor is effective in detecting IRFs.","PeriodicalId":248988,"journal":{"name":"2016 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT)","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126102313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BTI aware thermal management for reliable DVFS designs","authors":"H. Chahal, V. Tenentes, Daniele Rossi, B. Al-Hashimi","doi":"10.1109/DFT.2016.7684059","DOIUrl":"https://doi.org/10.1109/DFT.2016.7684059","url":null,"abstract":"In this paper, we show that dynamic voltage and frequency scaling (DVFS) designs, together with stress-induced BTI variability, exhibit high temperature-induced BTI variability, depending on their workload and operating modes. We show that the impact of temperature-induced variability on circuit lifetime can be higher than that due to stress and exceed 50% over the value estimated considering the circuit average temperature. In order to account for these variabilities in lifetime estimation at design time, we propose a simulation framework for the BTI degradation analysis of DVFS designs accounting for workload and actual temperature profiles. A profile is generated considering statistically probable workload and thermal management constraints by means of the HotSpot tool. Using the proposed framework we explore the expected lifetime of the Ethernet circuit from the IWLS05 benchmark suite, synthesized with a 32nm CMOS technology library, for various thermal management constraints. We show that margin-based design can underestimate or overestimate lifetime of DVFS designs by up to 67.8% and 61.9%, respectively. 
Therefore, the proposed framework allows designers to select the dynamic thermal management constraints appropriately in order to trade off long-term reliability (lifetime) and performance with up to 35.8% and 26.3% higher accuracy, respectively, against a temperature-variability unaware BTI analysis.","PeriodicalId":248988,"journal":{"name":"2016 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120937729","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient utilization of hierarchical iJTAG networks for interrupts management","authors":"Ahmed M. Y. Ibrahim, H. Kerkhoff","doi":"10.1109/DFT.2016.7684077","DOIUrl":"https://doi.org/10.1109/DFT.2016.7684077","url":null,"abstract":"Modern systems-on-chips rely on embedded instruments for testing and debugging; the same instruments could be used for managing the lifetime dependability of the chips. The IEEE 1687 (iJTAG) standard introduces an access network to the instruments based on reconfigurable scan paths. During lifetime, instruments could be required to initiate communication with a system-level dependability manager for different reasons, for example fault/event occurrences or measurement read-out requests; however, iJTAG networks are inherently master/slave networks, where the instruments are the network slaves. In this work, a scalable interrupts-management methodology is presented for allowing instrument-initiated communication using hierarchical iJTAG networks. The presented method allows for efficient access to the network according to the required use case by allowing the network to be configured into a corresponding optimized mode. 
In addition, a novel on-chip localization methodology is presented, which significantly reduces the localization time of interrupting instruments as compared to previous works.","PeriodicalId":248988,"journal":{"name":"2016 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134242036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fault-tolerant scheduling of multicore mixed-criticality systems under permanent failures","authors":"Zaid Al-bayati, B. Meyer, Haibo Zeng","doi":"10.1109/DFT.2016.7684070","DOIUrl":"https://doi.org/10.1109/DFT.2016.7684070","url":null,"abstract":"Mixed-criticality systems are real-time systems that combine both safety-critical and non-critical applications. These systems have been gaining interest due to their practical applications and their adoption in standards in several domains. On the other hand, it is often overlooked in the research on mixed-criticality systems that these systems are still safety critical and must be able to operate even when the system's processors fail. In this paper, we present an approach to design multicore mixed-criticality systems that can survive permanent processor failures. Under the proposed work, critical tasks executing on the failing cores are migrated to other cores to allow them to continue execution. Space is made on the new cores by dropping non-critical tasks. Schedulability analysis is developed by extending the AMC-rtb analysis to support processor failures. The problem of finding a mixed-criticality system configuration on a multicore architecture is formulated as a Mixed Integer Linear Programming (MILP) problem. 
The MILP produces a system design that is schedulable on the underlying platform and tolerant to processor failures.","PeriodicalId":248988,"journal":{"name":"2016 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115477676","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Guiding Genetic Algorithms using importance measures for reliable design of embedded systems","authors":"H. Aliee, Stefan Vitzethum, M. Glaß, J. Teich, E. Borgonovo","doi":"10.1109/DFT.2016.7684069","DOIUrl":"https://doi.org/10.1109/DFT.2016.7684069","url":null,"abstract":"Reliability importance measures (IMs) support analysts in understanding the contributions of components to the reliability of the system under investigation. This understanding can be used to improve the reliability of a system and, at the same time, restrict the cost penalty by upgrading only the highly important components to more reliable ones. This paper studies how IMs can enhance the design of embedded systems, more specifically how they can guide the optimization process. The observations are later employed to modify a well-known Genetic Algorithm (GA) to create new offspring using the IMs of the components of their parents. The experimental results prove the efficiency of the proposed algorithm, which not only seeks more reliable designs, but also reckons with other design objectives (in this paper, resource cost and power consumption) concurrently to ensure that they are not degraded through the optimization process.","PeriodicalId":248988,"journal":{"name":"2016 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126760916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"In-place LUT polarity inVersion to mitigate soft errors for FPGAs","authors":"J. Su, Ju-Yueh Lee, Chang Wu, Lei He","doi":"10.1109/DFT.2016.7684074","DOIUrl":"https://doi.org/10.1109/DFT.2016.7684074","url":null,"abstract":"In-place Polarity inVersion (IPV) has been proposed to mitigate the single event upset (SEU) induced soft errors for academic VPR FPGA architectures, and this paper extends the original IPV so that it can be used for commercial FPGA architectures. Different from the original IPV, we use a new soft error model based on signal probability and propose a simple yet effective greedy based algorithm. To validate the effectiveness of IPV 2.0, we map circuits by ISE followed by IPV 2.0 to a Xilinx Virtex-5 x5vlx110t FPGA, and inject faults to the mapped circuits during run time. Experiments show that IPV 2.0 reduces soft errors by about 1.4× on average and up to 2× when compared to the circuits mapped by ISE without IPV 2.0.","PeriodicalId":248988,"journal":{"name":"2016 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115588818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bounding error detection latency in safety critical systems with enhanced Execution Fingerprinting","authors":"Mojing Liu, B. Meyer","doi":"10.1109/DFT.2016.7684068","DOIUrl":"https://doi.org/10.1109/DFT.2016.7684068","url":null,"abstract":"Latent soft errors may disrupt execution millions of instructions after their occurrence, wasting significant computational resources when recovery requires re-execution. While Execution Fingerprinting (EF) has emerged as a cost-effective fault detection alternative for automotive mixed-criticality multicore safety-critical systems, like lockstep execution, it suffers from unbounded error detection latency (EDL). We propose State Checkpointing (EF-SC) and Selective State Streaming (EF-SSS), which actively push register state into the fingerprinting stream, in a single burst or selectively over time, respectively. Given a maximum EDL of 20K instructions, EF-SC and EF-SSS experience 0.08 and 0.02% performance degradation and 0.96 and 1.06% fingerprinting overhead, respectively.","PeriodicalId":248988,"journal":{"name":"2016 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT)","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126186684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design and analysis of an approximate 2D convolver","authors":"Ke Chen, F. Lombardi, Jie Han","doi":"10.1109/DFT.2016.7684065","DOIUrl":"https://doi.org/10.1109/DFT.2016.7684065","url":null,"abstract":"This paper proposes a two-dimensional (2D) convolver in which both approximate circuit- and algorithm-level techniques are utilized in the design. Truncation is used as the circuit-level technique, while bit-width reduction is utilized at the algorithm level. These different techniques are related to the configuration of the convolver by which its operation can be configured to meet different and often contrasting figures of merit. Circuit-level simulation (using HSPICE) and an extensive evaluation of different error metrics, including generic metrics such as the mean error distance (MED) and the peak signal-to-noise ratio (PSNR) for image convolution, are performed. A detailed error analysis is also presented to substantiate the simulation results. Convolution for image processing (Gaussian smoothing) is treated in detail to show the effectiveness of the proposed approach. The design, the analysis and the simulation results show that the approximate techniques utilized in the inexact convolver can operate in synergy.","PeriodicalId":248988,"journal":{"name":"2016 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123525725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Low cost resilient regular expression matching on FPGAs","authors":"Marcos T. Leipnitz, E. N. D. Souza, G. Nazar","doi":"10.1109/DFT.2016.7684073","DOIUrl":"https://doi.org/10.1109/DFT.2016.7684073","url":null,"abstract":"The Network Function Virtualization (NFV) paradigm promises to make networks more scalable and flexible by decoupling the network functions (NFs) from dedicated and vendor-specific hardware. However, network and compute intensive NFs may be difficult to virtualize without performance degradation. In this context, Field-Programmable Gate Arrays (FPGAs) have been shown to be a good option for hardware acceleration of virtual NFs that require high throughput, without deviating from the concept of an NFV infrastructure which aims at high flexibility. Regular expression matching is an important and compute intensive mechanism used to perform Deep Packet Inspection, which can be FPGA-accelerated to meet performance constraints. This solution, however, introduces new challenges regarding dependability requirements. Particularly for FPGAs, soft errors on the configuration memory are a significant dependability threat. 
Therefore, in this work we evaluate the effects of configuration faults on the functionality of FPGA-based regular expression matching engines and propose fault tolerance mechanisms to improve their resilience against such faults.","PeriodicalId":248988,"journal":{"name":"2016 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121406374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prognosis of NBTI aging using a machine learning scheme","authors":"Naghmeh Karimi, K. Huang","doi":"10.1109/DFT.2016.7684060","DOIUrl":"https://doi.org/10.1109/DFT.2016.7684060","url":null,"abstract":"Circuit aging is an important failure mechanism in nanoscale designs and is a growing concern for the reliability of future systems. Aging results in circuit performance degradation over time and, ultimately, circuit failure. Among aging mechanisms, Negative-Bias Temperature Instability (NBTI) is the main limiting factor of circuit lifetime. Estimating the effect of aging-related degradation, before it actually occurs, is crucial for developing aging prevention/mitigation actions to avoid circuit failures. In this paper, we propose a general-purpose IC aging prognosis approach by considering a comprehensive set of IC operating conditions including workload, usage time and operating temperature. In addition, our model considers process variation by using a calibration technique applied at the time of manufacturing. Experimental results confirm that our model is able to accurately predict the NBTI-related path delay degradation under various operating conditions. The proposed model is robust to process variations.","PeriodicalId":248988,"journal":{"name":"2016 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122565996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}