{"title":"Bisection Based Placement for the X Architecture","authors":"S. Ono, S. Tilak, P. Madden","doi":"10.1109/ASPDAC.2007.357978","DOIUrl":"https://doi.org/10.1109/ASPDAC.2007.357978","url":null,"abstract":"Rising interconnect delay and power consumption have motivated the investigation of alternative integrated circuit routing architectures. In particular, the X architecture, which features preferred routing in diagonal directions, has gained a measure of industry support, and has even been validated at 65nm. While there has been extensive study of Manhattan design methods, there are markedly fewer published results for non-Manhattan design. To help fill this gap, we study a patented placement method for the X architecture; to our knowledge, there have been no prior published results for the method. Surprisingly, we find that the patented method in fact performs worse than traditional Manhattan methods - for both Manhattan and X routing metrics. We also present a theoretic formulation which explains why solution quality is degraded. Many groups in industry are evaluating the merits of non-Manhattan routing architectures. By providing concrete experimental results, we hope to improve the accuracy of these evaluations.","PeriodicalId":362373,"journal":{"name":"2007 Asia and South Pacific Design Automation Conference","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121759088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recognition of Fanout-free Functions","authors":"Tsung-Lin Lee, Chun-Yao Wang","doi":"10.1109/ASPDAC.2007.358023","DOIUrl":"https://doi.org/10.1109/ASPDAC.2007.358023","url":null,"abstract":"Factoring is a logic minimization technique to represent a Boolean function in an equivalent function with minimum literals. When realizing the circuit, a function represented in a more compact form has smaller area. Some Boolean functions even have equivalent forms where each variable appears exactly once, which are known as fanout-free functions. John P. Hayes (Hayes, 1975) had devised an algorithm to determine if a function can be fanout-free and construct the circuit if fanout-free realization exists. In this paper, we propose a property and an efficient technique to accelerate this algorithm. With our improvements, execution time of this algorithm is more competitive with the state-of-the-art method (Golumbic, 2001).","PeriodicalId":362373,"journal":{"name":"2007 Asia and South Pacific Design Automation Conference","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130018254","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
S. Matsunaga, T. Hanyu, H. Kimura, Takashi Nakamura, H. Takasu
{"title":"Implementation of a Standby-Power-Free CAM Based on Complementary Ferroelectric-Capacitor Logic","authors":"S. Matsunaga, T. Hanyu, H. Kimura, Takashi Nakamura, H. Takasu","doi":"10.1109/ASPDAC.2007.357968","DOIUrl":"https://doi.org/10.1109/ASPDAC.2007.357968","url":null,"abstract":"A complementary ferroelectric-capacitor (CFC) logic-circuit style is proposed for a compact and standby-power-free content-addressable memory (CAM). Since the use of the CFC logic circuit in designing a CAM cell makes it possible to merge both logic and non-volatile storage elements into serially connected ferroelectric capacitors, the CAM becomes compact. The standby power of the CAM is completely eliminated because the supply voltage can be cut off with maintaining stored data in the CAM. The test chip is fabricated by using 0.35-mum ferroelectric CMOS, and the basic behavior can be also measured.","PeriodicalId":362373,"journal":{"name":"2007 Asia and South Pacific Design Automation Conference","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130214456","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Parameterized Architecture Model in High Level Synthesis for Image Processing Applications","authors":"Yazhuo Dong, Y. Dou","doi":"10.1109/ASPDAC.2007.358039","DOIUrl":"https://doi.org/10.1109/ASPDAC.2007.358039","url":null,"abstract":"Most image processing applications are computationally intensive and data intensive. Reconfigurable hardware boards provide a convenient and flexible solution to speed up these algorithms. To get a high performance design without going through the time-consuming hardware design process for each different algorithm, we present a universal parameterized architecture in high level synthesis to generate the hardware frames for all image processing applications automatically. The value of the parameters which decide the target architecture can be obtained from the compiler. The algorithm how to get these parameters is also discussed in this paper.","PeriodicalId":362373,"journal":{"name":"2007 Asia and South Pacific Design Automation Conference","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122450698","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ahmad A. Al-Yamani, Narendra Devta-Prasanna, A. Gunda
{"title":"Systematic Scan Reconfiguration","authors":"Ahmad A. Al-Yamani, Narendra Devta-Prasanna, A. Gunda","doi":"10.1109/ASPDAC.2007.358075","DOIUrl":"https://doi.org/10.1109/ASPDAC.2007.358075","url":null,"abstract":"We present a new test data compression technique that achieves 10times to 40times compression ratios without requiring any information from the ATPG tool about the unspecified bits. The technique is applied to both single-stuck as well as transition fault test sets. The technique allows aggressive parallelization of scan chains leading to similar reduction in test time. It also reduces tester pins requirements by similar ratios. The technique is implemented using a hardware overhead of a few gates per scan chain.","PeriodicalId":362373,"journal":{"name":"2007 Asia and South Pacific Design Automation Conference","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122452458","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Current-based Method for Short Circuit Power Calculation under Noisy Input Waveforms","authors":"H. Fatemi, Shahin Nazarian, Massoud Pedram","doi":"10.1109/ASPDAC.2007.358083","DOIUrl":"https://doi.org/10.1109/ASPDAC.2007.358083","url":null,"abstract":"An accurate model is presented to calculate the short circuit energy dissipation of logic cells. The short circuit current is highly dependent on the input and output voltage values. Therefore the actual shape of the voltage signal waveforms at the input and output of the cell should be considered in order to precisely calculate the short circuit energy dissipation. Previous approaches such as the approximation of the crosstalk induced noisy waveforms with saturated ramps can lead to short circuit energy estimation errors as high as an order of magnitude for a minimum sized inverter. To resolve this shortcoming, a current-based logic cell model is utilized, which constructs the output voltage waveform for a given noisy input waveform. The input and output voltage waveforms are then used to calculate the short circuit current, and hence, short circuit energy dissipation. A characterization process is executed for each logic cell in the standard cell library to model the relevant electrical parameters e.g., the parasitic capacitances and nonlinear current sources. Additionally, our model is capable of calculating the short circuit energy dissipation caused by glitches in VLSI circuits, which in some cases can be a key contributor to the total circuit energy dissipation. Experimental results show an average error of about 1% and a maximum error of 3% compared to SPICE for different types of logic cells under noisy input waveforms including glitches while the runtime speedup is up to a factor of 16,000.","PeriodicalId":362373,"journal":{"name":"2007 Asia and South Pacific Design Automation Conference","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125422177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"VLSI Design of Multi Standard Turbo Decoder for 3G and Beyond","authors":"Imran Ahmed, T. Arslan","doi":"10.1109/ASPDAC.2007.358050","DOIUrl":"https://doi.org/10.1109/ASPDAC.2007.358050","url":null,"abstract":"Turbo decoding architectures have greater error correcting capability than any other known code. Due to their excellent performance turbo codes have been employed in several transmission systems such as CDMA2000, WCDMA (UMTS), ADSL, IEEE 802.16 metropolitan networks etc. The computation kernel of the algorithm is very similar and we have exploited this commonality for a turbo decoder VLSI design suitable for deployment using platform based system on chip methodologies. Turbo and Viterbi components of the unified array are also individually reconfigurable for different standards. This supports the 4G concept that user can be simultaneously connected to several access technologies (for example Wi-Fi, 3G, GSM etc.) and can seamlessly move between them. A new normalization scheme for turbo decoding is presented to suit reconfigurable mappings. We have also shown dynamic reconfiguration methodology for a context switch between turbo and Viterbi decoders which does not waste any clock cycles. The reconfigurable turbo decoder fabric is implemented reusing components of Viterbi decoder on a 180 nm UMC process technology.","PeriodicalId":362373,"journal":{"name":"2007 Asia and South Pacific Design Automation Conference","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129193645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Psuedo-Millimeter-Wave Up-Conversion Mixer with On-Chip Balun for Vehicular Radar Systems","authors":"I. Lai, M. Fujishima","doi":"10.1109/ASPDAC.2007.357963","DOIUrl":"https://doi.org/10.1109/ASPDAC.2007.357963","url":null,"abstract":"A low-power, fully integrated 20-26 GHz broadband up-conversion mixer implemented with on-chip Marchand baluns is demonstrated on 90nm CMOS technology in this paper. The baluns employ capacitive coupling between two metal layers and include slotted shields to reduce substrate losses. At 22.1 GHz, the integrated mixer achieves a conversion gain of 2 dB with a maximum power dissipation of only 11.1mW from a 1.2V dc power supply at LO power of 5 dBm.","PeriodicalId":362373,"journal":{"name":"2007 Asia and South Pacific Design Automation Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122191126","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"System Architecture for Software Peripherals","authors":"S. Choudhuri, T. Givargis","doi":"10.1109/ASPDAC.2007.357792","DOIUrl":"https://doi.org/10.1109/ASPDAC.2007.357792","url":null,"abstract":"Software peripherals (Lioupis et al., 2001) have been proposed as a design alternative to traditional peripherals. We propose a software architecture, design methodology and scheduling scheme for implementing software peripherals on general purpose processors, with fast context switch and high resolution timers. Our design flow automatically generates code for scheduling software peripherals. We demonstrate the feasibility of our proposed work by experimenting with a set of five software peripherals scheduled to execute on a MIPS processor. Our performance evaluations show that the performance impact of the software peripherals on user-level tasks is minimal (i.e., 10.11% on a 100 MHz processor) - strongly suggesting that with the right architecture, software peripherals can be efficiently accommodated in typical embedded applications.","PeriodicalId":362373,"journal":{"name":"2007 Asia and South Pacific Design Automation Conference","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130509580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Nakajima, Takao Yamamoto, M. Yamasaki, T. Hosoki, M. Sumita
{"title":"Low Power Techniques for Mobile Application SoCs Based on Integrated Platform \"UniPhier\"","authors":"M. Nakajima, Takao Yamamoto, M. Yamasaki, T. Hosoki, M. Sumita","doi":"10.1109/ASPDAC.2007.358060","DOIUrl":"https://doi.org/10.1109/ASPDAC.2007.358060","url":null,"abstract":"In this paper, we describe the various low power techniques for mobile application SoCs based on the integrated platform \"UniPhier\". To minimize SoC power dissipation, hierarchical approaches from UniPhier Soc level, UniPhier processor level, IPP (instruction parallel processor) level, and circuit level are adopted. As SoC level, 1) well functionally isolated 5 major units of UniPhier SoC architecture, 2) dedicated stream DMA controller which can minimize CPU load and memory access. As UniPhier processor level, 1) UniPhier processor consists of IPP with dedicated low power hardware engine, 2) VMP (virtual multi-processor) mechanism with micro sleep which can reduce average power consumption in case of multiple tasks concurrent operation, 3) intermittent operation with the combination of micro-sleep and clock/power down scheme in case of very light load operation. As IPP level, 1) sophisticated instruction fetch buffer mechanism which can reduce more than 50% memory access for instruction fetch. 2) Hierarchical and selective clock gating scheme by detailed clock power analysis and clock activity rate analysis on real application.) Optimized physical implementation with low-power library and selective use of custom macros. As circuit level, mixed body bias technique with fixed Id and fixed Vt control which can realize 85 % delay variation suppressed and 25% ED product improvement compared with the no body bias.","PeriodicalId":362373,"journal":{"name":"2007 Asia and South Pacific Design Automation Conference","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130604187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}