Donghwa Shin, Jihun Kim, N. Chang, J. Choi, S. Chung, Eui-Young Chung
{"title":"Energy-optimal dynamic thermal management for green computing","authors":"Donghwa Shin, Jihun Kim, N. Chang, J. Choi, S. Chung, Eui-Young Chung","doi":"10.1145/1687399.1687520","DOIUrl":"https://doi.org/10.1145/1687399.1687520","url":null,"abstract":"Existing thermal management systems for microprocessors assume that the thermal resistance of the heat-sink is constant and that the objective of the cooling system is simply to avoid thermal emergencies. But in fact the thermal resistance of the usual forced-convection heat-sink is inversely proportional to the fan speed, and a more rational objective is to minimize the total power consumption of both processor and cooling system. Our new method of dynamic thermal management uses both the fan speed and the voltage/frequency of the microprocessor as control variables. Experiments show that tracking the energy-optimal steady-state temperature can saves up to 17.6% of the overall energy, when compared with a conventional approach that merely avoids overheating.","PeriodicalId":256358,"journal":{"name":"2009 IEEE/ACM International Conference on Computer-Aided Design - Digest of Technical Papers","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114838108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Timing model extraction for sequential circuits considering process variations","authors":"Bing Li, Ning Chen, Ulf Schlichtmann","doi":"10.1145/1687399.1687463","DOIUrl":"https://doi.org/10.1145/1687399.1687463","url":null,"abstract":"As semiconductor devices continue to scale down, process variations become more relevant for circuit design. Facing such variations, statistical static timing analysis is introduced to model variations more accurately so that the pessimism in traditional worst case timing analysis is reduced. Because all delays are modeled using correlated random variables, most statistical timing methods are much slower than corner based timing analysis. To speed up statistical timing analysis, we propose a method to extract timing models for flip-flop and latch based sequential circuits respectively. When such a circuit is used as a module in a hierarchical design, the timing model instead of the original circuit is used for timing analysis. The extracted timing models are much smaller than the original circuits. Experiments show that using extracted timing models accelerates timing verification by orders of magnitude compared to previous approaches using flat netlists directly. Accuracy is maintained, however, with the mean and standard deviation of the clock period both showing usually less than 1% error compared to Monte Carlo simulation on a number of benchmark circuits.","PeriodicalId":256358,"journal":{"name":"2009 IEEE/ACM International Conference on Computer-Aided Design - Digest of Technical Papers","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117031250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MOLES: Malicious off-chip leakage enabled by side-channels","authors":"Lang Lin, W. Burleson, C. Paar","doi":"10.1145/1687399.1687425","DOIUrl":"https://doi.org/10.1145/1687399.1687425","url":null,"abstract":"Economic incentives have driven the semiconductor industry to separate design from fabrication in recent years. This trend leads to potential vulnerabilities from untrusted circuit foundries to covertly implant malicious hardware Trojans into a genuine design. Hardware Trojans provide back doors for on-chip manipulation, or leak secret information off-chip once the compromised IC is deployed in the field. This paper explores the design space of hardware Trojans and proposes a novel technique, “Malicious Off-chip Leakage Enabled by Side-channels” (MOLES), which employs power side-channels to convey secret information off-chip. An experimental MOLES circuit is designed with fewer than 50 gates and is embedded into an Advanced Encryption Standard (AES) cryptographic circuit in a predictive 45nm CMOS technology model. Engineered by a spread-spectrum technique, the MOLES technique is capable of leaking multi-bit information below the noise power level of the host IC to evade evaluators' detections. In addition, a generalized methodology for a class of MOLES circuits and design verification by statistical correlation analysis are presented. The goal of this work is to demonstrate the potential threats of MOLES on embedded system security. Nevertheless, MOLES could be constructively used for hardware authentication, fingerprinting and IP protection.","PeriodicalId":256358,"journal":{"name":"2009 IEEE/ACM International Conference on Computer-Aided Design - Digest of Technical Papers","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117249330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jarrod A. Roy, Natarajan Viswanathan, Gi-Joon Nam, C. Alpert, I. Markov
{"title":"CRISP: Congestion reduction by iterated spreading during placement","authors":"Jarrod A. Roy, Natarajan Viswanathan, Gi-Joon Nam, C. Alpert, I. Markov","doi":"10.1145/1687399.1687467","DOIUrl":"https://doi.org/10.1145/1687399.1687467","url":null,"abstract":"Dramatic progress has been made in algorithms for placement and routing over the last 5 years, with improvements in both speed and quality. Combining placement and routing into a joint optimization has also been proposed. However, it remains unclear if the benefits would be significant enough to justify major changes in commercial tools. CRISP addresses this challenge and is the first tool to demonstrate tangible benefits of combined place-and-route optimization including fewer global routing detours, reduced detailed routing violations and runtime, and even shrinking the floorplan of a commercial design. We employ fast global routing to choose standard cells to temporarily inflate and iteratively spread for congestion reduction. Spreading only in congested regions, we enable die area reduction by facilitating routing with high area utilization.","PeriodicalId":256358,"journal":{"name":"2009 IEEE/ACM International Conference on Computer-Aided Design - Digest of Technical Papers","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128737868","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-level clustering for clock skew optimization","authors":"J. Casanova, J. Cortadella","doi":"10.1145/1687399.1687502","DOIUrl":"https://doi.org/10.1145/1687399.1687502","url":null,"abstract":"Clock skew scheduling has been effectively used to reduce the clock period of sequential circuits. However, this technique may become impractical if a different skew must be applied for each memory element. This paper presents a new technique for clock skew scheduling constrained by the number of skew domains. The technique is based on a multi-level clustering approach that progressively groups flip-flops with skew affinity. This new technique has been compared with previous work, showing the efficiency in the obtained performance and computational cost. As an example, the skews for an OpenSparc with almost 16 K flip-flops and 500 K paths have been calculated in less than 5 minutes when using only 2 to 5 skew domains.","PeriodicalId":256358,"journal":{"name":"2009 IEEE/ACM International Conference on Computer-Aided Design - Digest of Technical Papers","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129607682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Leveraging efficient parallel pattern search for clock mesh optimization","authors":"Xiaoji Ye, S. Narasimhan, Peng Li","doi":"10.1145/1687399.1687499","DOIUrl":"https://doi.org/10.1145/1687399.1687499","url":null,"abstract":"Mesh-based clock distribution network has been employed in many high-performance microprocessor designs due to its favorable properties such as low clock skew and robustness. Such clock distributions are usually highly complex. While the simulation of clock meshes is already time consuming, tuning such networks under tight performance constraints is a more daunting task. In this paper, we address the challenging task of driver size optimization with a goal of skew minimization. The expensive objective function evaluations and difficulty in getting explicit sensitivity information make this problem intractable to standard optimization methods. We propose to explore the recently developed asynchronous parallel pattern search (APPS) method for efficient driver size tuning. While being a search-based method, APPS not only provides the desirable derivative-free optimization capability, but is also amenable to parallelization and possesses appealing theoretically rigorous convergence properties. We show how such a method can lead to powerful parallel sizing optimization of large clock meshes with significant runtime and quality advantages over the traditional sequential quadratic programming (SQP) method. We also show how design-specific properties and speeding-up techniques can be exploited to make the optimization even more efficient while maintaining the convergence of APPS in a practical sense.","PeriodicalId":256358,"journal":{"name":"2009 IEEE/ACM International Conference on Computer-Aided Design - Digest of Technical Papers","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128809228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast and reliable passivity assessment and enforcement with extended Hamiltonian pencil","authors":"Zuochang Ye, L. M. Silveira, J. Phillips","doi":"10.1145/1687399.1687542","DOIUrl":"https://doi.org/10.1145/1687399.1687542","url":null,"abstract":"Passivity is an important property for a macro-model generated from measured or simulated data. Existence of purely imaginary eigenvalues of a Hamiltonian matrix provides useful information in assessing and correcting the passivity of a system. Since direct computation of eigenvalues is very expensive for large-scale systems, several authors have proposed to solve iteratively for a subset of the eigenvalues based on heuristic sampling along the imaginary axis. However, completeness is not guaranteed in such methods and thus potential risk of missing important eigenvalues is difficult to avoid. In this paper we are aiming at finding all eigenvalues efficiently to avoid both the high cost and the potential risk of missing important eigenvalues. The idea of the proposed method is to convert the Hamiltonian matrix to an equivalent sparse form, termed the “extended Hamiltonian pencil”, and solve for its eigenvalues efficiently using a special eigensolver. Experiments on several realistic systems demonstrate an 80X speed-up compared with standard direct eigensolvers.","PeriodicalId":256358,"journal":{"name":"2009 IEEE/ACM International Conference on Computer-Aided Design - Digest of Technical Papers","volume":"224 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122486179","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gene-regulatory memories: Electrical-equivalent modeling, simulation and parameter identification","authors":"Yong Zhang, Peng Li","doi":"10.1145/1687399.1687492","DOIUrl":"https://doi.org/10.1145/1687399.1687492","url":null,"abstract":"The development of gene-regulatory memory circuits provides key understandings of biological information storage and enables new biological applications. Computer models and simulations can provide quantitative analysis and prediction of the behaviors and functions of genetic networks, thereby providing valuable verification and design guidance. In this paper, we model the nonlinear dynamics associated with various chemical reactions in gene-regulatory memory networks using chemical reaction equations. These reaction equations are mapped into a set of electrical-equivalent models and the network is simulated by an extended SPICE-like circuit simulation environment. Furthermore, we address the practical difficulty in direct characterization of network model parameters by developing a simulation-driven Bayesian framework for parameter identification. To ensure the reliable identification of key system properties, we propose a two-step structure-preserving parameter identification approach. The first step infers bistability, the most critical characteristics of a memory device; and the second step is geared towards identifying dynamical properties of the network while maintaining the identified bistability. We demonstrate the proposed approaches through extensive simulations that well agree with established biological understandings and identified networks that recreate measured circuit responses in a statistical sense.","PeriodicalId":256358,"journal":{"name":"2009 IEEE/ACM International Conference on Computer-Aided Design - Digest of Technical Papers","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127595832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dara Rahmati, S. Murali, L. Benini, F. Angiolini, G. Micheli, H. Sarbazi-Azad
{"title":"A method for calculating hard QoS guarantees for Networks-on-Chip","authors":"Dara Rahmati, S. Murali, L. Benini, F. Angiolini, G. Micheli, H. Sarbazi-Azad","doi":"10.1145/1687399.1687507","DOIUrl":"https://doi.org/10.1145/1687399.1687507","url":null,"abstract":"Many Networks-on-Chip (NoC) applications exhibit one or more critical traffic flows that require hard Quality of Service (QoS). Guaranteeing bandwidth and latency for such real time flows is crucial. In this paper, we present novel methods to efficiently calculate worst-case bandwidth and latency bounds and thereby provide hard QoS guarantees. Importantly, the proposed methods apply even to best-effort NoC architectures, with no extra hardware dedicated to QoS support. By applying our methods to several realistic NoC designs, we show substantial improvements (on average, more than 30% in bandwidth and 50% in latency) in bound tightness with respect to existing approaches.","PeriodicalId":256358,"journal":{"name":"2009 IEEE/ACM International Conference on Computer-Aided Design - Digest of Technical Papers","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131793875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jiniun Xionq, Yiyu Shi, V. Zolotov, C. Visweswariah
{"title":"Pre-ATPG path selection for near optimal post-ATPG process space coverage","authors":"Jiniun Xionq, Yiyu Shi, V. Zolotov, C. Visweswariah","doi":"10.1145/1687399.1687419","DOIUrl":"https://doi.org/10.1145/1687399.1687419","url":null,"abstract":"Path delay testing is becoming increasingly important for high-performance chip testing in the presence of process variation. To guarantee full process space coverage, the ensemble of critical paths of all chips irrespective of their manufacturing process conditions needs to be tested, as different chips may have different critical paths. Existing coverage-based path selection techniques, however, suffer from the loss of coverage after ATPG (automatic test pattern generation), i.e., although the pre-ATPG path selection achieves good coverage, after ATPG, the coverage can be severely reduced as many paths turn out to be unsensitizable. This paper presents a novel path selection algorithm that, without running ATPG, selects a set of n paths to achieve near optimal post-ATPG coverage. Details of the algorithm and its optimality conditions are discussed. Experimental results show that, compared to the state-of-the-art, the proposed algorithm achieves not only superior post-ATPG coverage, but also significant runtime speedup. Categories and Subject Descriptors B.7.2 [Integrated Circuits]: Design Aids General Terms Algorithms, Design, Theory","PeriodicalId":256358,"journal":{"name":"2009 IEEE/ACM International Conference on Computer-Aided Design - Digest of Technical Papers","volume":"165 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116053152","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}