{"title":"Electromagnetic parasitic extraction via a multipole method with hierarchical refinement","authors":"M. Beattie, L. Pileggi","doi":"10.1109/ICCAD.1999.810690","DOIUrl":"https://doi.org/10.1109/ICCAD.1999.810690","url":null,"abstract":"The increasing interconnect density and operating frequencies of system-on-a-chip (SOC) designs necessitates extraction of parasitic electromagnetic couplings beyond the localized confines of functional design blocks. In addition, SOC design styles and gridless variable-width routing make it increasingly difficult to use precharacterized library shapes for parasitic extraction. A comprehensive capacitance and inductance extraction solution requires a hierarchical data representation and fast runtime algorithms. We illustrate through examples that both the multipole method and hierarchical refinement, which are the two most successful approaches for parasitic extraction to date, work efficiently only under certain, limiting conditions. To improve this situation we present an approach which combines the best of both methods into a concurrent multipole refinement representation of the electromagnetic interaction which is efficient for arbitrary interconnect configurations. We use a generalized formulation of electromagnetic interactions to exploit the similarities in capacitance and inductance extraction for greater efficiency.","PeriodicalId":6414,"journal":{"name":"1999 IEEE/ACM International Conference on Computer-Aided Design. Digest of Technical Papers (Cat. No.99CH37051)","volume":"14 1","pages":"437-444"},"PeriodicalIF":0.0,"publicationDate":"1999-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73806061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Repeater insertion in tree structured inductive interconnect","authors":"Y. Ismail, E. Friedman, J. Neves","doi":"10.1007/978-1-4615-1685-9_9","DOIUrl":"https://doi.org/10.1007/978-1-4615-1685-9_9","url":null,"abstract":"","PeriodicalId":6414,"journal":{"name":"1999 IEEE/ACM International Conference on Computer-Aided Design. Digest of Technical Papers (Cat. No.99CH37051)","volume":"32 1","pages":"420-424"},"PeriodicalIF":0.0,"publicationDate":"1999-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74288501","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"JMTP: an architecture for exploiting concurrency in embedded Java applications with real-time considerations","authors":"Rachid Helaihel, K. Olukotun","doi":"10.1109/ICCAD.1999.810710","DOIUrl":"https://doi.org/10.1109/ICCAD.1999.810710","url":null,"abstract":"Using Java in embedded systems is plagued by problems of limited runtime performance and unpredictable runtime behavior. The Java Multi-Threaded Processor (JMTP) provides solutions to these problems. The JMTP architecture is a single chip containing an off-the-shelf general purpose processor core coupled with an array of Java Thread Processors (JTPs). Performance can be improved using this architecture by exploiting coarse-grained parallelism in the application. These performance improvements are achieved with relatively small hardware costs. Runtime predictability is improved by implementing a subset of the Java Virtual Machine (JVM) specification in the JTP and trimming away complexity without excessively restricting the Java code a JTP can handle. Moreover the JMTP architecture incorporates hardware to adaptively manage shared JMTP resources in order to satisfy JTP thread timing constraints or provide an early warning for a timing violation. This is an important feature for applications with quality-of-service demands. In addition to the hardware architecture, we describe a software framework that analyzes a Java application for expressed and implicit coarse-grained concurrent threads to execute on JTPs. This framework identifies the optimal mapping of an application to a JMTP with an arbitrary number of JTPs. We have tested this framework on a variety of applications including IDEA encryption with different JTP configurations and confirmed that the algorithm was able to obtain desired results in each case.","PeriodicalId":6414,"journal":{"name":"1999 IEEE/ACM International Conference on Computer-Aided Design. Digest of Technical Papers (Cat. No.99CH37051)","volume":"15 1","pages":"551-557"},"PeriodicalIF":0.0,"publicationDate":"1999-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73699571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Realizable reduction for RC interconnect circuits","authors":"A. Devgan, P. O'Brien","doi":"10.1109/ICCAD.1999.810650","DOIUrl":"https://doi.org/10.1109/ICCAD.1999.810650","url":null,"abstract":"Interconnect reduction is an important step in the design and analysis of complex interconnects found in present-day integrated circuits. This paper presents techniques for obtaining realizable and accurate reduced models for two-port and multi-port RC circuits. The proposed method is also particularly suitable for interconnect reduction for nonlinear circuit simulation and for interconnect post-processing in a parasitic extractor. The method has two limitations. First, it only considers the first few moments of the transfer function; however that is accurate enough for RC circuits. Second, the amount of interconnect reduction is topology dependent. Although, most on-chip interconnect topologies are well suited for the method proposed. Accuracy and efficiency of the proposed method is demonstrated for various realistic examples.","PeriodicalId":6414,"journal":{"name":"1999 IEEE/ACM International Conference on Computer-Aided Design. Digest of Technical Papers (Cat. No.99CH37051)","volume":"13 1","pages":"204-207"},"PeriodicalIF":0.0,"publicationDate":"1999-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88816870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Chebyshev expansion based passive model for distributed interconnect networks","authors":"Janet Roveda, E. Kuh, Qingjian Yu","doi":"10.1109/ICCAD.1999.810677","DOIUrl":"https://doi.org/10.1109/ICCAD.1999.810677","url":null,"abstract":"A new Chebyshev expansion based model for distributed interconnect networks is presented in this paper. Unlike the moment methods, this new model is optimal and it does not require the knowledge of expansion points. An automatic order selection scheme is also included in the new model. By using the integrated congruence transform, we guarantee the passivity of the new model for distributed interconnect networks. Because of the orthogonality of Chebyshev polynomials, the Modified Gram-Schmidt algorithm can be simplified. In the experimental examples, the new model is found to be accurate and efficient.","PeriodicalId":6414,"journal":{"name":"1999 IEEE/ACM International Conference on Computer-Aided Design. Digest of Technical Papers (Cat. No.99CH37051)","volume":"10 1","pages":"370-375"},"PeriodicalIF":0.0,"publicationDate":"1999-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78842224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Christoph Albrecht, B. Korte, Jürgen Schietke, J. Vygen
{"title":"Cycle time and slack optimization for VLSI-chips","authors":"Christoph Albrecht, B. Korte, Jürgen Schietke, J. Vygen","doi":"10.1109/ICCAD.1999.810654","DOIUrl":"https://doi.org/10.1109/ICCAD.1999.810654","url":null,"abstract":"We consider the problem of finding an optimal clock schedule, i.e. optimal arrival times for clock signals at latches of a VLSI chip. We describe a general model which includes all previously considered models. Then we show how to optimize the cycle time and optimally balance slacks on data paths and on clocktree paths. The problem of finding a clock schedule with the optimum cycle time was solved before, either by linear programming or by binary search, using a test for negative circuits in a digraph as a subroutine. We show for the first time that a direct combinatorial algorithm solves this problem optimally. Incidentally, this yields a new efficient method for timing analysis with transparent latches. Moreover, we extend this algorithm to the slack balancing problem: To make the chip less sensitive to routing detours, process variations and manufacturing skew it is desirable to have as few critical paths as possible. We show how to find the clock schedule with minimum number of critical paths (optimum slack distribution) in a well-defined sense. Rather than fixed dock arrival times we show how to obtain as large as possible intervals for the clock arrival times. This can be considered as slack on clocktree paths. Indeed, we can find the global optimum of simultaneous optimization of slacks on all data paths and clocktree paths. All the above is done by very efficient network optimization algorithms, based on parametric shortest paths. Our computational results with recent IBM processor chips show that the number of critical paths decreases dramatically, in addition to a considerable improvement of the cycle time. The running times are reasonable even for the largest designs.","PeriodicalId":6414,"journal":{"name":"1999 IEEE/ACM International Conference on Computer-Aided Design. Digest of Technical Papers (Cat. No.99CH37051)","volume":"1 1","pages":"232-238"},"PeriodicalIF":0.0,"publicationDate":"1999-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78575780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Regularity extraction via clan-based structural circuit decomposition","authors":"S. Hassoun, C. McCreary","doi":"10.1109/ICCAD.1999.810686","DOIUrl":"https://doi.org/10.1109/ICCAD.1999.810686","url":null,"abstract":"Identifying repeating structural regularities in circuits allows the minimization of synthesis, optimization, and layout efforts. We introduce in this paper a novel method for identifying a set of repeating circuit structures, referred to as templates, and we report on using an efficient binate cover solver to select an appropriate subset of templates with which to cover the circuit. Our approach is comprised of three steps. First, the circuit graph is decomposed in a hierarchical inclusion parse tree using a clan-based decomposition algorithm. This algorithm discovers clans, grouping of nodes in the circuit graph that have a natural affinity towards each other. Second, the parse tree nodes are classified into equivalence classes. Such classes represent templates suitable for circuit covering. The final step consists of using a binate cover solver to find an appropriate cover. The cover will consist of instantiated templates and gates that cannot be covered by any templates. We describe the results of applying this algorithm to several circuits, and show that the algorithm is effective in extracting structural regularity.","PeriodicalId":6414,"journal":{"name":"1999 IEEE/ACM International Conference on Computer-Aided Design. Digest of Technical Papers (Cat. No.99CH37051)","volume":"34 1","pages":"414-418"},"PeriodicalIF":0.0,"publicationDate":"1999-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80047337","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Synchronous equivalence for embedded systems: a tool for design exploration","authors":"H. Hsieh, F. Balarin","doi":"10.1109/ICCAD.1999.810702","DOIUrl":"https://doi.org/10.1109/ICCAD.1999.810702","url":null,"abstract":"Design exploration consists of analyzing several alternative implementations of the \"same\" function to determine the most desirable one. A fundamental question is whether an \"implementation\" is consistent with the high-level specification or whether two implementations are \"equivalent\". We define synchronous equivalence for embedded systems that strongly resembles the concept of functional equivalence for sequential circuits. We then present equivalence analysis algorithms that are of low polynomial complexity. We show an example of application of the algorithms to a real-life design (a shock absorber controller) and demonstrate that synchronous equivalence opens up design exploration avenues uncharted before.","PeriodicalId":6414,"journal":{"name":"1999 IEEE/ACM International Conference on Computer-Aided Design. Digest of Technical Papers (Cat. No.99CH37051)","volume":"48 1","pages":"505-509"},"PeriodicalIF":0.0,"publicationDate":"1999-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79981616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Function unit specialization through code analysis","authors":"Dan Benyamin, W. Mangione-Smith","doi":"10.1109/ICCAD.1999.810658","DOIUrl":"https://doi.org/10.1109/ICCAD.1999.810658","url":null,"abstract":"Many previous attempts at ASIP (application-specific instruction set processor) synthesis have employed template matching techniques to target function units to application code, or directly design new units to extract maximum performance. This paper presents an entirely new approach to specializing hardware for application-specific needs. In our framework of a parameterized VLIW processor, we use a post-modulo scheduling analysis to reduce the allocated hardware resources while increasing the code's performance. Initial results indicate significant savings in area, as well as optimizations to increase FIR filter code performance by 200% to 300%.","PeriodicalId":6414,"journal":{"name":"1999 IEEE/ACM International Conference on Computer-Aided Design. Digest of Technical Papers (Cat. No.99CH37051)","volume":"16 1","pages":"257-260"},"PeriodicalIF":0.0,"publicationDate":"1999-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80121387","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Functional timing optimization","authors":"A. Saldanha","doi":"10.1109/ICCAD.1999.810708","DOIUrl":"https://doi.org/10.1109/ICCAD.1999.810708","url":null,"abstract":"It is widely believed that any true (or sensitizable) critical path of length /spl ges/2 T must be speeded up in order for a circuit to have a delay <T. I demonstrate that this notion is pessimistic. Many true paths can never affect the delay of the circuit-whenever such a path propagates a signal, some other path that is at least as long also propagates a signal. The theory for a new classification of paths based on the impact on the circuit delay is presented and conditions are given under which a path (or a set of paths) must be speeded up in order to improve the circuit delay. The conditions for the categorization are independent of the delays in the circuit and are valid for all delay assignments. This work indicates that the widely employed notions of true and false paths may be misleading both for timing optimization and delay analysis of logic circuits.","PeriodicalId":6414,"journal":{"name":"1999 IEEE/ACM International Conference on Computer-Aided Design. Digest of Technical Papers (Cat. No.99CH37051)","volume":"31 1","pages":"539-543"},"PeriodicalIF":0.0,"publicationDate":"1999-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83120593","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}