{"title":"Flip-Flop Hardening and Selection for Soft Error and Delay Fault Resilience","authors":"Mingjing Chen, A. Orailoglu","doi":"10.1109/DFT.2009.50","DOIUrl":"https://doi.org/10.1109/DFT.2009.50","url":null,"abstract":"The traditional test model of go/no-go testing being questioned by increasing delay fault manifestations has become even further challenged as a result of unpredictable soft errors. Consequent probabilistic fault manifestations shift the focus to fault resilience mechanisms and tradeoffs of false alarms vs. escapes. Fault manifestation at flip-flops necessitates solutions that rely on their hardening, possibly imposing inordinate cost as flip-flops constitute a significant fraction of current designs. A two-pronged approach for resolving this challenge is necessitated, consisting of frugal flip-flop designs, capable of withstanding such faults, and an economic rationalization model to enable a prioritized flip-flop selection within an overall design budget. In this paper, we propose a hardened flip-flop that increases circuit tolerance to soft errors and delay faults simultaneously and the associated selective hardening scheme guided by a unified quality evaluation framework. The proposed flip-flop supersedes previous research efforts and simulation results show that the outlined framework delivers yield recovery and FIT reduction at a minimized hardware cost.","PeriodicalId":405651,"journal":{"name":"2009 24th IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131460352","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reduced Precision Checking for a Floating Point Adder","authors":"Patrick J. Eibl, Andrew D. Cook, Daniel J. Sorin","doi":"10.1109/DFT.2009.22","DOIUrl":"https://doi.org/10.1109/DFT.2009.22","url":null,"abstract":"We present an error detection technique for a floating point adder which uses a checker adder of reduced precision to determine if the result is correct within some error bound. Our analysis establishes a relationship between the width of the checker adder’s mantissa and the worst-case magnitude of an undetected error in the primary adder’s result. This relationship allows for a tradeoff between error detection capability and area overhead that is not offered by any previously developed floating point adder checking schemes. Experimental results of fault injection experiments are presented which support our analysis.","PeriodicalId":405651,"journal":{"name":"2009 24th IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems","volume":"185 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133126640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Burst Error Detection Hybrid ARQ with Crosstalk-Delay Reduction for Reliable On-chip Interconnects","authors":"Bo Fu, P. Ampadu","doi":"10.1109/DFT.2009.45","DOIUrl":"https://doi.org/10.1109/DFT.2009.45","url":null,"abstract":"We present a hybrid ARQ (HARQ) scheme using single-error correcting burst-error detecting (SEC-BED) codes to address multiple errors in nanoscale on-chip interconnects. For a given residual flit error rate requirement, the proposed HARQ method yields 20% energy improvement over other burst error correction schemes. By further integrated with skewed transitions, the proposed HARQ method can efficiently improve the error resilience against burst errors and also reduce delay uncertainty caused by capacitive coupling. The low overhead of our approach makes it suitable for implementation in reliable and energy efficient on-chip communication.","PeriodicalId":405651,"journal":{"name":"2009 24th IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121314461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Defect Tolerant and Performance Tunable Gate Architecture for End-of-Roadmap CMOS","authors":"A. Singh","doi":"10.1109/DFT.2009.65","DOIUrl":"https://doi.org/10.1109/DFT.2009.65","url":null,"abstract":"In addition to high defect rates, end-of-roadmap CMOS at sub-10 nm gate lengths is also expected to display significant random variability in individual device performance. This will lead to unique and varied slow paths (statistical performance outliers) in individual ICs, severely limiting achievable clock rates. While it is widely accepted that to ensure viable manufacturing yield and reliable field operation, future circuits will need to be equipped with significant defect-tolerance capabilities, it is less commonly recognized that continued performance gains from scaling in the face of extreme parameter variations will also require a post manufacture performance tuning capability capable of speeding up the statistical slow paths on individual ICs that limit clock rates. In this paper, we show how a recently proposed defect-tolerant CMOS logic gate architecture can efficiently achieve both these goals. Our basic design exploits the inherent functional redundancy in static CMOS for defect tolerance; the CMOS logic gate is reconfigured to a pseudo-NMOS-like gate in the presence of a defect by using an appropriately sized single pull up or pull down transistor to replace the defective pull-up or pull down network. Thus the resulting defect-tolerant gate architecture can tolerate defects in either the pull-up or pull-down network and incurs only a modest area overhead. Multiple defects across the logic gates in a large CMOS design can also be tolerated. Importantly, these redundant pull up or pull down transistors can also be strategically turned on for performance tuning, speeding up a slow critical transition through a gate at the possible expense of a small slow down in the opposite transition. Results evaluating the effectiveness of the new defect tolerance and performance tuning technique are presented.","PeriodicalId":405651,"journal":{"name":"2009 24th IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130402796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving the Detectability of Resistive Open Faults in Scan Cells","authors":"Fan Yang, S. Chakravarty, Narendra Devta-Prasanna, S. Reddy, I. Pomeranz","doi":"10.1109/DFT.2009.30","DOIUrl":"https://doi.org/10.1109/DFT.2009.30","url":null,"abstract":"Recent studies have shown that new tests are required for the detection of a large percentage of scan cell internal open faults which are not detected by the existing tests. However, the additional coverage due to the new tests drops significantly when opens with moderate resistances are considered. In this paper we propose to augment earlier test methods to detect internal scan chain opens with a wider range of resistances. The newly proposed method includes application of tests at higher temperatures and modifications to an earlier proposed flush test. We also present an analysis to explain the additional coverage obtained by the proposed test methods.","PeriodicalId":405651,"journal":{"name":"2009 24th IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125853821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dreams, Plans, and Journey of Reaching Perfect Predictability and Reliability in ASICs","authors":"N. Sherwani","doi":"10.1109/DFT.2009.63","DOIUrl":"https://doi.org/10.1109/DFT.2009.63","url":null,"abstract":"","PeriodicalId":405651,"journal":{"name":"2009 24th IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114402773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Sensor to Detect Normal or Reverse Temperature Dependence in Nanoscale CMOS Circuits","authors":"D. Wolpert, P. Ampadu","doi":"10.1109/DFT.2009.47","DOIUrl":"https://doi.org/10.1109/DFT.2009.47","url":null,"abstract":"The temperature dependence of MOSFET drain current varies with supply voltage. Two distinct voltage regions exist—a normal dependence (ND) region where an increase in temperature decreases drain current, and a reverse dependence (RD) region where an increase in temperature increases drain current. Knowledge of the temperature dependence is critical for avoiding overheating and wasted performance from excessive guardbands. In this paper, we present the first temperature dependence sensor to detect whether a system is operating in the ND or RD region. The dependence sensor occupies an area of 985 NAND2 equivalent gates. The sensor consumes 15.9 pJ per sample at a supply voltage of 1 V, with a 1°C resolution over the military-specified temperature range of -55°C to 125°C.","PeriodicalId":405651,"journal":{"name":"2009 24th IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131389837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Memory Repair by Selective Row Partitioning","authors":"M. T. Rab, A. A. Bawa, N. Touba","doi":"10.1109/DFT.2009.20","DOIUrl":"https://doi.org/10.1109/DFT.2009.20","url":null,"abstract":"A new methodology for improving memory repair is presented which can be applied in either manufacture time repair or built-in self-repair (BISR) scenarios. In traditional memory repair, one spare column can only replace one column containing a defective cell. However, the proposed method allows a single spare column to be used to repair multiple defective cells in multiple columns. This is done by selectively decoding the row address bits when generating the control signals for the column MUXes. This logically segments the spare column allowing it to replace different columns in different partitions of the row address space. The hardware is the same for all chips, but fuses are used to customize the row decoding circuitry on a chip-by-chip basis. An algorithm is described for choosing which row address bits to decode given the defect map for a particular chip. This additional degree of freedom allows customization based on the defect map of a chip and increases the effectiveness of the proposed scheme in comparison to traditional memory repair. Experimental results show that, when compared with traditional schemes of similar complexity, the proposed scheme achieves a higher probability of repairing defects.","PeriodicalId":405651,"journal":{"name":"2009 24th IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131913680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fault-Tolerant Routing Algorithm for Network on Chip without Virtual Channels","authors":"Y. Fukushima, Masaru Fukushi, S. Horiguchi","doi":"10.1109/DFT.2009.41","DOIUrl":"https://doi.org/10.1109/DFT.2009.41","url":null,"abstract":"Constructing 2D mesh topology network on chips (NoCs) without using virtual channels becomes attractive approach to building future massive multi-core computer systems because of its large amount of bandwidths, less design complexity, and less space consumption of routers. Dead lock problem on NoC is critical because it makes data transmission between nodes unreachable, and inevitable failures in hardware make mesh topology irregular. Although several fault-tolerant techniques are available, deadlock-free routing control algorithm for irregular mesh topology is promising approach to utilize large amount of bandwidths of NoC. The main drawback of available routing control algorithms is that many healthy nodes are deactivated to guarantee deadlock-freeness, and a number of deactivated nodes lead to traffic congestion. In this paper, we propose new fault-tolerant routing algorithm on 2D mesh topology NoC constructed without using virtual channels. The proposed algorithm is fully analyzed its dead lock-freeness, and the experimental result shows that the proposed algorithm can achieve both less number of deactivated nodes and higher throughput.","PeriodicalId":405651,"journal":{"name":"2009 24th IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131279351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Reconfigurable ADC Circuit with Online-Testing Capability and Enhanced Fault Tolerance","authors":"Yueran Gao, Haibo Wang","doi":"10.1109/DFT.2009.31","DOIUrl":"https://doi.org/10.1109/DFT.2009.31","url":null,"abstract":"This paper investigates techniques to minimize process-variation induced performance degradation in pipeline ADCs via circuit reconfiguration. By taking advantage of the modularity existing in pipeline ADC circuits, this work introduces a configurable switch network that makes it possible to move more accurate pipeline circuits to the preceding stages along the signal processing path. The reconfiguration feature also adds online testing capabilities and enhances fault-tolerance of pipeline ADCs. An implementation of reconfigurable 10-bit 1.5-bit per stage pipeline ADC circuit is presented. Circuit simulation shows both improved circuit performance and fault tolerance are achieved by circuit reconfiguration.","PeriodicalId":405651,"journal":{"name":"2009 24th IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131361040","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}