Nobuhiro Doi, T. Horiyama, M. Nakanishi, S. Kimura
{"title":"Minimization of fractional wordlength on fixed-point conversion for high-level synthesis","authors":"Nobuhiro Doi, T. Horiyama, M. Nakanishi, S. Kimura","doi":"10.1109/ASPDAC.2004.1337544","DOIUrl":"https://doi.org/10.1109/ASPDAC.2004.1337544","url":null,"abstract":"In hardware synthesis from a high-level language such as C, the bit length of variables is one of the key issues in area and speed optimization. Usually, designers are required to specify the word length of each variable manually and to verify correctness by simulation on huge data sets. We propose a method for optimizing the fractional word length of variables in floating-point to fixed-point conversion. The amount of round-off error is formulated with parameters and propagated through data flow graphs. Nonlinear programming is used to solve the fractional word length minimization problem. The method does not require simulation on huge data sets and is very fast compared to simulation-based methods. We have shown the effect on several programs.","PeriodicalId":426349,"journal":{"name":"ASP-DAC 2004: Asia and South Pacific Design Automation Conference 2004 (IEEE Cat. No.04EX753)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114699924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Complexity analysis and speedup techniques for optimal buffer insertion with minimum cost","authors":"Weiping Shi, Zhuo Li, C. Alpert","doi":"10.1109/ASPDAC.2004.1337664","DOIUrl":"https://doi.org/10.1109/ASPDAC.2004.1337664","url":null,"abstract":"As gate delays decrease faster than wire delays for each technology generation, buffer insertion becomes a popular method to reduce the interconnect delay. Several modern buffer insertion algorithms (e.g., [7, 6, 15]) are based on van Ginneken's dynamic programming paradigm [14]. However, van Ginneken's original algorithm does not control buffering resources and tends to over-buffer, thereby wasting area and power. It has been a major open problem whether it is possible to optimize slack and at the same time minimize the buffer usage. This paper settles this open problem by showing that for arbitrary integer cost functions, the problem is NP-complete. We also extend the pre-buffer slack technique [12] to minimize the buffer cost. This technique can significantly reduce the running time and memory in the buffer cost minimization problem. The experimental results show that our algorithm can speed up the running time by up to 17 times and reduce the memory to 1/30 of that of the traditional best known algorithm. Finally, we show how to efficiently deal with multiway merge in buffer insertion.","PeriodicalId":426349,"journal":{"name":"ASP-DAC 2004: Asia and South Pacific Design Automation Conference 2004 (IEEE Cat. No.04EX753)","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134377682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Associative memory with fully parallel nearest-manhattan-distance search for low-power real-time single-chip applications","authors":"Yuji Yano, T. Koide, H. Mattausch","doi":"10.1109/ASPDAC.2004.1337640","DOIUrl":"https://doi.org/10.1109/ASPDAC.2004.1337640","url":null,"abstract":"A fully parallel minimum Manhattan-distance search associative memory has been designed in 0.35 μm CMOS with 3 metal layers. The nearest-match unit occupies only 1.02 mm², while the chip area is 7.49 mm². The measured winner-search time of this chip, the time to determine the best-matching reference-data word for an input-data word among a database of 128 reference words (5-bit, 16 units), is < 180 ns. This corresponds to a performance requirement of 16 GOPS/mm² if a 32-bit computer with the same chip area had to run the same workload. Furthermore, the power dissipation of the designed test chip is only about 26.7 mW/mm².","PeriodicalId":426349,"journal":{"name":"ASP-DAC 2004: Asia and South Pacific Design Automation Conference 2004 (IEEE Cat. No.04EX753)","volume":"140 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133208403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analytical expressions for phase noise eigenfunctions of LC oscillators","authors":"P. Ghanta, Zheng Li, J. Roychowdhury","doi":"10.1109/ASPDAC.2004.1337561","DOIUrl":"https://doi.org/10.1109/ASPDAC.2004.1337561","url":null,"abstract":"We obtain analytical expressions for eigenfunctions that characterize the phase noise performance of generic LC oscillator structures. Using these, we also obtain analytical expressions for the timing jitter and spectrum of such oscillators. Our approach is based on identifying three fundamental parameters, derived from the oscillator's steady state, that characterize these eigenfunctions. Our analysis accounts for the nonlinear mechanism that stabilizes oscillator amplitudes. It also lays out, quantitatively and in analytical form, how symmetry in an LC oscillator's negative resistance mechanism impacts the oscillator's eigenfunctions and its phase noise/jitter characteristics. We show that symmetry results in particularly simple forms for the PPV and resultant phase noise. We compare our expressions with existing LC oscillator design formulae and show that the expressions match for symmetric nonlinearities. We validate our analytical results against simulation on practical CMOS LC oscillator circuits. Our expressions and symmetry results are expected to be useful tools for optimizing phase noise performance during the design of LC oscillators.","PeriodicalId":426349,"journal":{"name":"ASP-DAC 2004: Asia and South Pacific Design Automation Conference 2004 (IEEE Cat. No.04EX753)","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130567573","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LPRAM: a low power DRAM with testability","authors":"S. Bhattacharjee, D. Pradhan","doi":"10.5555/1015090.1015188","DOIUrl":"https://doi.org/10.5555/1015090.1015188","url":null,"abstract":"To date, all proposals for low power RAM designs have essentially focused on circuit-level solutions. What we propose here is a novel architecture-level solution. Our methodology provides a systematic trade-off between power and area. It also allows a trade-off between test time and power consumed in test mode. Significantly, the proposed design also has the potential to achieve performance improvements while reducing power. In this respect it stands apart from other approaches, where conventional wisdom holds that reducing power reduces speed.","PeriodicalId":426349,"journal":{"name":"ASP-DAC 2004: Asia and South Pacific Design Automation Conference 2004 (IEEE Cat. No.04EX753)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122079491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
S. Bhattacharya, N. Jangkrajarng, R. Hartono, C. Shi
{"title":"Hierarchical extraction and verification of symmetry constraints for analog layout automation","authors":"S. Bhattacharya, N. Jangkrajarng, R. Hartono, C. Shi","doi":"10.1109/ASPDAC.2004.1337608","DOIUrl":"https://doi.org/10.1109/ASPDAC.2004.1337608","url":null,"abstract":"Device matching and layout symmetry are of utmost importance to high performance analog and RF circuits. Here, we present HiLSD, the first CAD tool for the automatic detection of layout symmetry between two or more devices in a hierarchical manner. HiLSD first extracts the circuit structure from the layout, then applies an efficient pattern-matching algorithm to find all the subcircuits automatically, and finally detects layout symmetry on the portion of the layout that corresponds to extracted subcircuit instances. On a set of practical analog layouts, HiLSD is demonstrated to be much more efficient than direct symmetry detection on a flattened layout. Results from applying HiLSD to automatic analog layout retargeting for technology migration and new specifications are also described.","PeriodicalId":426349,"journal":{"name":"ASP-DAC 2004: Asia and South Pacific Design Automation Conference 2004 (IEEE Cat. No.04EX753)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123417344","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A thread partitioning algorithm in low power high-level synthesis","authors":"J. Uchida, N. Togawa, M. Yanagisawa, T. Ohtsuki","doi":"10.1109/ASPDAC.2004.1337543","DOIUrl":"https://doi.org/10.1109/ASPDAC.2004.1337543","url":null,"abstract":"We propose a thread partitioning algorithm in low power high-level synthesis. The algorithm is applied to high-level synthesis systems. In the systems, we can describe parallel behaving circuit blocks (threads) explicitly. First it focuses on a local register file RF in a thread. It partitions a thread into two subthreads, one of which has RF and the other does not have RF. The partitioned subthreads need to be synchronized with each other to keep the data dependency of the original thread. Since the partitioned subthreads have waiting time for synchronization, gated clocks can be applied to each subthread. Then we can synthesize a low power circuit with a low area overhead, compared to the original circuit. Experimental results demonstrate effectiveness and efficiency of the algorithm.","PeriodicalId":426349,"journal":{"name":"ASP-DAC 2004: Asia and South Pacific Design Automation Conference 2004 (IEEE Cat. No.04EX753)","volume":"2005 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123765897","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predictable design of low power systems by pre-implementation estimation and optimization","authors":"W. Nebel","doi":"10.1109/ASPDAC.2004.1337531","DOIUrl":"https://doi.org/10.1109/ASPDAC.2004.1337531","url":null,"abstract":"Each year tens of billions of dollars are wasted by the microelectronics industry because of missed deadlines and delayed design projects. These delays are partially due to design iterations, many of which could have been avoided if the low-level ramifications of high-level design decisions at the architecture and algorithmic level had been known before the time-consuming and tedious RT- and lower-level implementation started. In this contribution we present a system-level design flow and respective EDA support tools for low power designs. We analyze the requirements for such a design technology, which shifts more responsibility to the system architect. We exemplify this approach with a design flow for low power systems. The architecture of an algorithm-level power estimation tool is presented together with some use cases based on an EDA product which has been commercially developed from the research results of several collaborative projects funded by the Commission of the European Community.","PeriodicalId":426349,"journal":{"name":"ASP-DAC 2004: Asia and South Pacific Design Automation Conference 2004 (IEEE Cat. No.04EX753)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127134480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Temperature-aware global placement","authors":"B. Obermeier, F. Johannes","doi":"10.1109/ASPDAC.2004.1337555","DOIUrl":"https://doi.org/10.1109/ASPDAC.2004.1337555","url":null,"abstract":"We describe a deterministic placement method for standard cells which minimizes total power consumption and leads to a smooth temperature distribution over the die. It is based on the quadratic placement formulation, where the overall weighted net length is minimized. Two innovations are introduced to achieve the above goals. First, overall power consumption is minimized by shortening nets with a high power dissipation. Second, cells are spread over the placement area such that the die temperature profile inside the package is flattened. Experimental results show a significant reduction of the maximum temperature on the die and a reduction of total power consumption.","PeriodicalId":426349,"journal":{"name":"ASP-DAC 2004: Asia and South Pacific Design Automation Conference 2004 (IEEE Cat. No.04EX753)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128048552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. Maxiaguine, S. Künzli, S. Chakraborty, L. Thiele
{"title":"Rate analysis for streaming applications with on-chip buffer constraints","authors":"A. Maxiaguine, S. Künzli, S. Chakraborty, L. Thiele","doi":"10.1109/ASPDAC.2004.1337553","DOIUrl":"https://doi.org/10.1109/ASPDAC.2004.1337553","url":null,"abstract":"While mapping a streaming (such as multimedia or network packet processing) application onto a specified architecture, an important issue is to determine the input stream rates that can be supported by the architecture for any given mapping. This is subject to typical constraints such as on-chip buffers should not overflow, and specified play out buffers (which feed audio or video devices) should not underflow, so that the quality of the audio/video output is maintained. The main difficulty in this problem arises from the high variability in execution times of stream processing algorithms, coupled with the bursty nature of the streams to be processed. We present a mathematical framework for such a rate analysis for streaming applications, and illustrate its feasibility through a detailed case study of a MPEG-2 decoder application. When integrated into a tool for automated design-space exploration, such an analysis can be used for fast performance evaluation of different stream processing architectures.","PeriodicalId":426349,"journal":{"name":"ASP-DAC 2004: Asia and South Pacific Design Automation Conference 2004 (IEEE Cat. No.04EX753)","volume":"158 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126205712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}