{"title":"Adaptive sampling for efficient failure probability analysis of SRAM cells","authors":"J. Jaffari, M. Anis","doi":"10.1145/1687399.1687515","DOIUrl":"https://doi.org/10.1145/1687399.1687515","url":null,"abstract":"In this paper, an adaptive sampling method is proposed for the statistical SRAM cell analysis. The method is composed of two components. One part is the adaptive sampler that manipulates an alternative sampling distribution iteratively to minimize the estimated yield error. The drifts of the sampling distribution are re-configured in each iteration toward further minimization of the estimation variance by using the data obtained from the previous circuit simulations and applying a high-order Householder's method. Secondly, an analytical framework is developed and integrated with the adaptive sampler to further boost the efficiency of the method. This is achieved by the optimal initialization of the alternative multi-variate Gaussian distribution via setting its drift vector and covariance matrix. The required number of simulation iterations to obtain the yield with a certain accuracy is several orders of magnitude lower than that of the crude-Monte Carlo method with the same confidence interval.","PeriodicalId":256358,"journal":{"name":"2009 IEEE/ACM International Conference on Computer-Aided Design - Digest of Technical Papers","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129043062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved heuristics for finite word-length polynomial datapath optimization","authors":"B. Alizadeh, M. Fujita","doi":"10.1145/1687399.1687536","DOIUrl":"https://doi.org/10.1145/1687399.1687536","url":null,"abstract":"Conventional high-level synthesis techniques are not able to manipulate polynomial expressions efficiently due to the lack of suitable optimization techniques for redundancy elimination over Z_{2^n}. This paper, in comparison with, presents 1) an improved partitioning heuristic based on single-variable monomials instead of checking all sub-polynomials, 2) an improved compensation heuristic which is able to compensate monomials as well as coefficients, and 3) a combined area-delay-optimized factorization approach to extract the most frequently used sub-expressions from multi-output polynomials over Z_{2^n}. Experimental results have shown an average saving of 32% and 27.2% in the number of logic gates and critical path delay respectively compared to the state-of-the-art techniques. Regarding the comparison with, the number of gates and delay are improved by 14.3% and 13.9% respectively. Furthermore, the results show that the combined area-delay optimization can reduce the average delay by 26.4%.","PeriodicalId":256358,"journal":{"name":"2009 IEEE/ACM International Conference on Computer-Aided Design - Digest of Technical Papers","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127914496","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Taming irregular EDA applications on GPUs","authors":"Yangdong Deng, Bo D. Wang, Shuai Mu","doi":"10.1145/1687399.1687501","DOIUrl":"https://doi.org/10.1145/1687399.1687501","url":null,"abstract":"Recently, general-purpose computing on graphics processing units (GPUs) has been rising as an exciting new trend in high-performance computing. Thus it is appealing to study the potential of GPUs for Electronic Design Automation (EDA) applications. However, EDA generally involves irregular data structures such as sparse matrix and graph operations, which pose significant challenges for efficient GPU implementations. In this paper, we propose high-performance GPU implementations for two important irregular EDA computing patterns, Sparse-Matrix Vector Product (SMVP) and graph traversal. On a wide range of EDA problem instances, our SMVP implementations outperform all published work and achieve a speedup of one order of magnitude over the CPU baseline. Upon such a basis, both timing analysis and linear system solution can be considerably accelerated. We also introduce an SMVP-based formulation for Breadth-First Search and observe considerable speedup on GPU implementations. Our results suggest that the power of GPU computing can be successfully unleashed through designing GPU-friendly algorithms and/or re-organizing computing structures of current algorithms.","PeriodicalId":256358,"journal":{"name":"2009 IEEE/ACM International Conference on Computer-Aided Design - Digest of Technical Papers","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121121867","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Task management in MPSoCs: An ASIP approach","authors":"J. Castrillón, Diandian Zhang, T. Kempf, B. Vanthournout, R. Leupers, G. Ascheid","doi":"10.1145/1687399.1687508","DOIUrl":"https://doi.org/10.1145/1687399.1687508","url":null,"abstract":"Scheduling, mapping and synchronization have an essential impact on the performance of Multi-Processor System-on-Chips (MPSoCs), especially in heterogeneous systems with many cores and small tasks. This paper presents a technique to efficiently accelerate these operations. The key contribution is an Application-Specific Instruction-set Processor (ASIP) called OSIP, which is specifically tailored to this purpose. In contrast to pure HW solutions, OSIP is programmable and hence features higher flexibility and better scalability. OSIP comes with a compiler and firmware that ease its use, and an abstract formal model that allows analytical evaluation and integration into fast system-level simulators. Together with OSIP, a thin software layer is proposed that supports high-level multi-task programming by abstracting away OSIP's low-level details. In an extensive case study based on a synthetic benchmark and a benchmark from the multimedia domain (H.264), OSIP demonstrates its potential when compared against a standard RISC and an ARM926EJ-S processor.","PeriodicalId":256358,"journal":{"name":"2009 IEEE/ACM International Conference on Computer-Aided Design - Digest of Technical Papers","volume":"110 16","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120820752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Binning optimization based on SSTA for transparently-latched circuits","authors":"Min-Sik Gong, H. Zhou, Jun Tao, Xuan Zeng","doi":"10.1145/1687399.1687462","DOIUrl":"https://doi.org/10.1145/1687399.1687462","url":null,"abstract":"With increasing process variation, binning has become an important technique to improve the value of fabricated chips, especially in high-performance microprocessors where transparent latches are widely used. In this paper, we formulate and solve the binning optimization problem that decides the bin boundaries and their testing order to maximize the benefit (considering the test cost) for a transparently-latched circuit. The problem is decomposed into three sub-problems which are solved sequentially. First, to compute the clock period distribution of the transparently-latched circuit, a sample-based SSTA approach is developed which is based on the generalized stochastic collocation method (gSCM) with the Sparse Grid technique. The minimal clock period at each sample point is found by solving a minimal cycle ratio problem in the constraint graph. Second, a greedy algorithm is proposed to maximize the sales profit by iteratively assigning each boundary to its optimal position. Third, an optimal algorithm with O(n log n) runtime, based on an alphabetic tree, is used to generate the optimal testing order of bin boundaries to minimize the test cost. Experiments on all the ISCAS'89 sequential benchmarks with 65-nm technology show 6.69% profit improvement and 14.00% cost reduction on average. The results also demonstrate that the proposed SSTA method achieves an error of 0.70% and a speedup of 110X on average compared with Monte Carlo simulation. Categories and Subject Descriptors: J.6 [Computer-Aided Engineering]: Computer-Aided Design. General Terms: Design, Algorithms.","PeriodicalId":256358,"journal":{"name":"2009 IEEE/ACM International Conference on Computer-Aided Design - Digest of Technical Papers","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130757869","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-functional interconnect co-optimization for fast and reliable 3D stacked ICs","authors":"Young-Joon Lee, Rohan Goel, S. Lim","doi":"10.1145/1687399.1687519","DOIUrl":"https://doi.org/10.1145/1687399.1687519","url":null,"abstract":"Heat removal and power delivery have become two major reliability concerns in 3D stacked IC technology. For the thermal problem, two possible solutions exist: thermal through-silicon vias (T-TSVs) and micro-fluidic channel (MFC) based liquid cooling. In the case of power delivery, a highly complex power distribution network is required to deliver currents reliably to all parts of the 3D stacked IC while suppressing the power supply noise to an acceptable level. However, these thermal and power networks pose major challenges in signal routability and congestion. This is because the signal, power, and thermal interconnects are all competing for routing space. In this paper, we present a co-optimization methodology for the signal, power, and thermal interconnects for 3D stacked ICs based on design of experiments (DOE) and response surface method (RSM). The goal is to improve performance, thermal, noise, and congestion metrics with our holistic approach. We also provide an in-depth comparison between T-TSV and MFC based cooling methods and discuss how to employ DOE and RSM to best co-optimize the multi-functional interconnects simultaneously. Categories and Subject Descriptors: B.7.2 [Hardware]: Integrated Circuits-Design Aids. General Terms: Design, Reliability.","PeriodicalId":256358,"journal":{"name":"2009 IEEE/ACM International Conference on Computer-Aided Design - Digest of Technical Papers","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131714454","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal layer assignment for escape routing of buses","authors":"Tan Yan, Hui Kong, Martin D. F. Wong","doi":"10.1145/1687399.1687444","DOIUrl":"https://doi.org/10.1145/1687399.1687444","url":null,"abstract":"Escape routing is a critical problem in PCB design. In ICCAD'07, a layer assignment algorithm was proposed for escape routing of buses. The algorithm is optimal for single-layer design in the sense that it determines if a set of buses can all be escaped on one layer. If they cannot, the algorithm is able to select a maximum subset of the buses that can be escaped on one layer. This, in turn, leads to a heuristic for the layer assignment problem with multiple layers, which is to repeatedly assign a maximum subset of the unassigned buses to a new layer. In this work, we present an algorithm that solves the multi-layer layer assignment problem optimally. Our algorithm is guaranteed to produce a layer assignment with the minimum number of layers. We applied our algorithm to industrial data and obtained encouraging results. Categories and Subject Descriptors: B.7.2 [Integrated Circuits]: Design Aids-Placement and Routing. General Terms: Algorithm, Theory.","PeriodicalId":256358,"journal":{"name":"2009 IEEE/ACM International Conference on Computer-Aided Design - Digest of Technical Papers","volume":"311 5","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113956032","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Post-fabrication measurement-driven oxide breakdown reliability prediction and management","authors":"Cheng Zhuo, D. Blaauw, D. Sylvester","doi":"10.1145/1687399.1687482","DOIUrl":"https://doi.org/10.1145/1687399.1687482","url":null,"abstract":"Oxide breakdown has become an increasingly pressing reliability issue in modern VLSI design with ultra-thin oxides. The conventional guard-band methodology assumes uniformly thin oxide thickness and results in overly pessimistic reliability estimation that severely degrades the system performance. In this study, we present the use of limited post-fabrication measurements of oxide thicknesses from on-chip sensors to aid in chip-level oxide breakdown reliability prediction and to quantify the trade-off between reliability margin and system performance. Given the post-fabrication measurements, chip oxide breakdown reliability can be formulated as a conditional distribution that allows us to achieve a significantly more accurate chip lifetime estimation. The estimation is then used to individually tune the supply voltage of each chip for performance maximization while maintaining or improving the reliability. Experimental results show that the proposed method can achieve a performance improvement of 19% on average and 27% at maximum for a design with up to 50 million devices, using merely 25 measurements per chip, while analysis time is only 0.4 seconds. Categories and Subject Descriptors: B.7.2 [Hardware]: Integrated Circuits-design aids. General Terms: Performance, Algorithms.","PeriodicalId":256358,"journal":{"name":"2009 IEEE/ACM International Conference on Computer-Aided Design - Digest of Technical Papers","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114608113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Operating system scheduling for efficient online self-test in robust systems","authors":"Yanjing Li, O. Mutlu, S. Mitra","doi":"10.1145/1687399.1687436","DOIUrl":"https://doi.org/10.1145/1687399.1687436","url":null,"abstract":"Very thorough online self-test is essential for overcoming major reliability challenges such as early-life failures and transistor aging in advanced technologies. This paper demonstrates the need for operating system (OS) support to efficiently orchestrate online self-test in future robust systems. Experimental data from an actual dual quad-core system demonstrate that, without software support, online self-test can significantly degrade performance of soft real-time and computation-intensive applications (by up to 190%), and can result in perceptible delays for interactive applications. To mitigate these problems, we develop OS scheduling techniques that are aware of online self-test, and schedule/migrate tasks in multi-core systems by taking into account the unavailability of one or more cores undergoing online self-test. These techniques eliminate any performance degradation and perceptible delays in soft real-time and interactive applications (otherwise introduced by online self-test), and significantly reduce the impact of online self-test on the performance of computation-intensive applications. Our techniques require minor modifications to existing OS schedulers, thereby enabling practical and efficient online self-test in real systems.","PeriodicalId":256358,"journal":{"name":"2009 IEEE/ACM International Conference on Computer-Aided Design - Digest of Technical Papers","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123999234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"First steps towards SAT-based formal analog verification","authors":"S. Tiwary, Anubhav Gupta, J. Phillips, C. Pinello, R. Zlatanovici","doi":"10.1145/1687399.1687401","DOIUrl":"https://doi.org/10.1145/1687399.1687401","url":null,"abstract":"Boolean satisfiability (SAT) based methods have traditionally been popular for formally verifying properties for digital circuits. We present a novel methodology for formulating a SPICE-type circuit simulation problem as a satisfiability problem. We start with a circuit level netlist, capture the non-linear behavior of the circuits at the transistor level via conservative approximations and transform the simulation problem into a search problem that can be exhaustively explored via a SAT solver. Thus, for DC as well as fixed time-step based transient and periodic steady state (PSS) simulation formulations, the solutions produced by the solver are formal in nature. We also present algorithms for abstraction refinement and smart interval generation to improve the computational efficiency of our proposed solution scheme. We have implemented our ideas into a tool called fSpice which is the first attempt at building a formal SPICE engine. We demonstrate the applicability of our ideas by showing experimental results using pruned versions of real designs that faced challenges during chip tape-out.","PeriodicalId":256358,"journal":{"name":"2009 IEEE/ACM International Conference on Computer-Aided Design - Digest of Technical Papers","volume":"34 Suppl 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116600059","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}