Sungjae Kim, E. Shragowitz, G. Karypis, Rung-Bin Lin
{"title":"Interleaving of Gate Sizing and Constructive Placement for Predictable Performance","authors":"Sungjae Kim, E. Shragowitz, G. Karypis, Rung-Bin Lin","doi":"10.1109/VDAT.2007.373225","DOIUrl":"https://doi.org/10.1109/VDAT.2007.373225","url":null,"abstract":"This paper presents a fast fixed-die standard cell placement algorithm. Placement is achieved by a combination of top-down partitioning with the incremental row-by-row construction. This paper concentrates on the construction part of this process. Gate sizing is interleaved with the placement construction process. Before placement, every gate is given its minimal size. During the placement, gates are resized to satisfy the timing constraints. Behavior of the placement is adapted based on dynamically recomputed net delay bounds. Experimental results show significant improvement in timing, predictability of results, and run time with respect to a commercial placement tool.","PeriodicalId":137915,"journal":{"name":"2007 International Symposium on VLSI Design, Automation and Test (VLSI-DAT)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115188024","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On-Chip Bus Encoding for Power Minimization Under Delay Constraint","authors":"Tzu-Wei Lin, Shang-Wei Tu, Jing-Yang Jou","doi":"10.1109/VDAT.2007.373210","DOIUrl":"https://doi.org/10.1109/VDAT.2007.373210","url":null,"abstract":"As technology advances, the global interconnect delay and the power consumption of long wires become crucial issues in nanometer technologies. In particular, both inductive and capacitive coupling effects between wires result in serious problems such as crosstalk delay, coupling noise, and power consumption. In this paper, we propose a new bus encoding scheme for global bus design in nanometer technologies. With the user-given bus parameters, the working frequency, and the delay constraint, the scheme can minimize the bus power consumption subject to the delay constraint by effectively reducing the LC coupling effects.","PeriodicalId":137915,"journal":{"name":"2007 International Symposium on VLSI Design, Automation and Test (VLSI-DAT)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126246995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Native-Mode Self Test for Embedded Systems on a Chip","authors":"J. Abraham","doi":"10.1109/VDAT.2007.373194","DOIUrl":"https://doi.org/10.1109/VDAT.2007.373194","url":null,"abstract":"Advances in semiconductor technology have enabled the integration of digital, mixed-signal, and RF systems on a single chip. While systems on a chip (SoCs) offer many benefits in cost and performance, they pose significant challenges for testing after manufacture. Trends in technology as well as applications which pose problems for conventional test will be described. A novel approach which uses the computational resources of an SoC to test itself will be described as a way to deal with emerging test problems. Techniques to test the embedded digital, analog and RF modules in the SoC will be discussed. Results of simulations and measurements on prototype hardware show that the approach can predict the specifications of the modules with high accuracy, pointing towards a new direction for low-cost manufacturing test of future products.","PeriodicalId":137915,"journal":{"name":"2007 International Symposium on VLSI Design, Automation and Test (VLSI-DAT)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121831598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Efficient Design-for-Testability Scheme for Motion Estimation in H.264/AVC","authors":"Tung-Hsing Wu, Yi-Lin Tsai, Soon-Jyh Chang","doi":"10.1109/VDAT.2007.373255","DOIUrl":"https://doi.org/10.1109/VDAT.2007.373255","url":null,"abstract":"In this paper, a complete analysis for the input combinations of balanced and unbalanced adder trees based on C-testability conditions is presented. Based on the analysis, a simple and efficient design-for-testability scheme is proposed to implement the testable design for motion estimation (ME) circuit in H.264/AVC. The proposed testable scheme is applied to bit-level regular arrangement for the variable-block-size ME architecture. It guarantees 100% fault coverage with only 8 sets of test patterns. The proposed circuit design was synthesized with TSMC 0.13 μm technology. Simulation results show that the proposed design only increases about 6.5% area overhead compared to the original ME circuit with acceptable timing penalty.","PeriodicalId":137915,"journal":{"name":"2007 International Symposium on VLSI Design, Automation and Test (VLSI-DAT)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125911870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 4-Bit, 13.5GSample/sec Track-and-Hold Circuit","authors":"I-Hsin Wang, Shen-Iuan Liu","doi":"10.1109/VDAT.2007.373231","DOIUrl":"https://doi.org/10.1109/VDAT.2007.373231","url":null,"abstract":"With the channel length scaled down in CMOS process technology, the clock feedthrough will degrade the performance of a high-speed track-and-hold (T/H) circuit due to the thin-oxide gate leakage and parasitic overlapping capacitance. A 13.5 GSample/sec CMOS T/H circuit with clock feedthrough and charge injection cancellation is proposed. This T/H circuit has been fabricated in 90 nm CMOS process. It achieves 4-bit resolution from 250 MHz to 5 GHz analog input signal at 13.5 GSample/sec and dissipates 89 mW from single 1 V supply voltage.","PeriodicalId":137915,"journal":{"name":"2007 International Symposium on VLSI Design, Automation and Test (VLSI-DAT)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129219328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Testing Crosstalk Faults of Data and Address Buses in Embedded RAMs","authors":"Jiunn-Der Yu, Jin-Fu Li, Tsu-Wei Tseng","doi":"10.1109/VDAT.2007.373218","DOIUrl":"https://doi.org/10.1109/VDAT.2007.373218","url":null,"abstract":"Random access memories (RAMs) have many long parallel wires which incur a greater probability for excessive crosstalk coupling effect. This paper presents a test algorithm for detecting crosstalk faults of address and data buses in RAMs. The test algorithm requires 12n+2m+2 Read/Write operations to cover 100% crosstalk faults for a RAM with m-bit addresses, n-bit data inputs/outputs. A BIST supporting March-CW and the proposed test is also implemented. Experimental results show that the area cost of the BIST is only about 3.1% for an 8 K×16-bit RAM based on TSMC 0.18 μm standard cell library.","PeriodicalId":137915,"journal":{"name":"2007 International Symposium on VLSI Design, Automation and Test (VLSI-DAT)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125241269","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Challenges for Low-power Embedded SOC's","authors":"T. Hattori","doi":"10.1109/VDAT.2007.373214","DOIUrl":"https://doi.org/10.1109/VDAT.2007.373214","url":null,"abstract":"Low power is one of the most important metrics in the embedded SOC's design. Many techniques and technologies for low-power design are developed and applied in the practical design projects. One of the difficulties in low-power design is the definition of power consumption. The power consumption of the LSI varies according to the operating function and data values. Current low-power techniques are focusing to reduce the power consumption using the characteristics of LSI behaviors. Some low-power techniques only supply the hardware features of low power and require the software control using these low-power features. This means that the system level approach including hardware features and software control is very important in low-power design. The final goal of low-power LSI is not the smallest power consumption of LSI but the long battery life, the low-cost cooling equipment, the small body, etc. of the embedded systems. Many low-power techniques, which are used in the practical SOC's like SH-mobile application processors for mobile phones will be discussed. And new challenges for low-power solution in system level will be also discussed.","PeriodicalId":137915,"journal":{"name":"2007 International Symposium on VLSI Design, Automation and Test (VLSI-DAT)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128517949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Power-Saved 1Gbps Irregular LDPC Decoder based on Simplified Min-Sum Algorithm","authors":"Qi Wang, K. Shimizu, T. Ikenaga, S. Goto","doi":"10.1109/VDAT.2007.373219","DOIUrl":"https://doi.org/10.1109/VDAT.2007.373219","url":null,"abstract":"In this paper we proposed a fully-parallel irregular LDPC decoder which uses only registers to store the temporary intrinsic messages. Our decoder adopts a simplified min-sum algorithm to reduce the hardware implementation complexity and area, and due to the factor modification we achieve a negligible performance loss compared with the general min-sum algorithm. Considering reducing the power consumption, we also propose a power-saved strategy according to which the message evolution will halt as the parity-check condition is satisfied. This strategy will save us higher than 50% power under good channel condition. The synthesis result in 0.18 μm CMOS technology shows our decoder for (648,540) irregular LDPC code achieves high throughput (1 Gbps) with 9.0 ns latency.","PeriodicalId":137915,"journal":{"name":"2007 International Symposium on VLSI Design, Automation and Test (VLSI-DAT)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130457155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design and Implementation of Reconfigurable RSA Cryptosystem","authors":"Yun-Lu Chen, C. Tseng, Hsie-Chia Chang","doi":"10.1109/VDAT.2007.373258","DOIUrl":"https://doi.org/10.1109/VDAT.2007.373258","url":null,"abstract":"In this paper, the hardware implementation of a reconfigurable RSA cryptosystem is presented. In order to match distinct security levels, the modified Montgomery modular multiplication algorithm is introduced into this 512/1024/2048/4096-bits RSA encryption/decryption. The huge number of register is also replaced by 5 memory blocks. As a result, our design including 5 memory blocks achieves the baud rate of 99 kb/s for 512-bit, 29 kb/s for 1024-bit, 6.8 kb/s for 2048-bit and 1.7 kb/s for 4096-bit on Xilinx Vertex2 XC2V8000 of 6783 slices.","PeriodicalId":137915,"journal":{"name":"2007 International Symposium on VLSI Design, Automation and Test (VLSI-DAT)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130369264","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Maximizing Full-Chip Simulation Signal Visibility for Efficient Debug","authors":"Y. Hsu","doi":"10.1109/VDAT.2007.373207","DOIUrl":"https://doi.org/10.1109/VDAT.2007.373207","url":null,"abstract":"The most expensive parts of today's system-on-chip (SoC) design flow are where engineers must engage in direct manual effort. Unfortunately, far too much time and money are wasted on tasks that don't add value, such as trying to figure out how supposedly-correct IP is actually working, debugging \"dumb\" errors, or deciding what signals to record in any given simulation run. With small block-level simulation, it is practical to record every value change on every signal. This produces a rich database of time-ordered event data that can be used for understanding the block's behavior and debugging errors. However, for large subsystem or full-chip level simulation, the overhead required to record all the events on all the signals overwhelms the run-time and fills the available disk space. Run times can explode by a factor of five. Disk requirements can run to the 100s of gigabytes. The emergence of visibility enhancement technologies enables engineers to make intelligent tradeoffs between impact (simulation performance and file size) and visibility.","PeriodicalId":137915,"journal":{"name":"2007 International Symposium on VLSI Design, Automation and Test (VLSI-DAT)","volume":"433 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133846781","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}