{"title":"Synthesis and design of parameter extractors for low-power pre-computation-based content-addressable memory using gate-block selection algorithm","authors":"Jui-Yuan Hsieh, S. Ruan","doi":"10.5555/1356802.1356884","DOIUrl":"https://doi.org/10.5555/1356802.1356884","url":null,"abstract":"Content addressable memory (CAM) is frequently used in applications, such as lookup tables, databases, associative computing, and networking, that require high-speed searches due to its ability to improve application performance by using parallel comparison to reduce search time. Although the use of parallel comparison results in fast search time, it also significantly increases power consumption. In this paper, we propose a gate-block selection algorithm, which can synthesize a proper parameter extractor of the pre-computation-based CAM (PB-CAM) to improve the efficiency for specific applications such as embedded systems. Through experimental results, we found that our approach effectively reduces the number of comparison operations for specific data types (ranging from 19.24% to 27.42%) compared with the 1's count approach. We used Synopsys Nanosim to estimate the power consumption in TSMC 0.35 um CMOS process. Compared to the 1's count PB-CAM, our proposed PB-CAM achieves 17.72% to 21.09% in power reduction for specific data types.","PeriodicalId":277556,"journal":{"name":"2008 Asia and South Pacific Design Automation Conference","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134241492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Temperature-aware MPSoC scheduling for reducing hot spots and gradients","authors":"A. Coskun, T. Simunic, K. Whisnant, K. Gross","doi":"10.1109/ASPDAC.2008.4484002","DOIUrl":"https://doi.org/10.1109/ASPDAC.2008.4484002","url":null,"abstract":"Thermal hot spots and temperature gradients on the die need to be minimized to manufacture reliable systems while meeting energy and performance constraints. In this work, we solve the task scheduling problem for multiprocessor system-on-chips (MPSoCs) using Integer Linear Programming (ILP). The goal of our optimization is minimizing the hot spots and balancing the temperature distribution on the die for a known set of tasks. Under the given assumptions about task characteristics, the solution is optimal. We compare our technique against optimal scheduling methods for energy minimization, energy balancing, and hot spot minimization, and show that our technique achieves significantly better thermal profiles. We also extend our technique to handle workload variations at runtime.","PeriodicalId":277556,"journal":{"name":"2008 Asia and South Pacific Design Automation Conference","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131414517","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Faster projection based methods for circuit level verification","authors":"Chao Yan, M. Greenstreet","doi":"10.1109/ASPDAC.2008.4483985","DOIUrl":"https://doi.org/10.1109/ASPDAC.2008.4483985","url":null,"abstract":"As VLSI fabrication technology progresses to 65 nm feature sizes and smaller, transistors no longer operate as ideal switches. This motivates the verification of digital circuits using continuous models. Recently, we showed how such verification can be performed using projection based methods. However, the verification was slow, requiring nearly four CPU days to verify a nine-transistor toggle flip-flop. Here, we describe improvements to the reachability algorithms and optimizations of the software architecture. These produce a 15x reduction in computation time and significant reductions in the overapproximation errors. With these changes, the same toggle flip-flop can be verified in a few hours, making formal verification a viable alternative to circuit simulation.","PeriodicalId":277556,"journal":{"name":"2008 Asia and South Pacific Design Automation Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115683177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An efficient performance improvement method utilizing specialized functional units in Behavioral Synthesis","authors":"Tsuyoshi Sadakata, Y. Matsunaga","doi":"10.1109/ASPDAC.2008.4483969","DOIUrl":"https://doi.org/10.1109/ASPDAC.2008.4483969","url":null,"abstract":"This paper proposes a novel behavioral synthesis method that improves performance of synthesized circuits utilizing specialized functional units effectively. Specialized functional units (e.g. multiply-accumulator) are designed for specific operation patterns to achieve shorter delay and/or smaller area than cascaded basic functional units. Almost all conventional methods cannot use specialized functional units effectively under a total area constraint because of their less flexibility for resource sharing. The proposed method makes it possible to solve module selection, scheduling, and functional unit allocation problems utilizing specialized functional units in practical time with some heuristics, and to reduce the number of clock cycles under total area and clock cycle time constraints. Experimental results show that the proposed method has achieved up to 35% and on average 14% reduction of the number of cycles with specialized functional units in practical time.","PeriodicalId":277556,"journal":{"name":"2008 Asia and South Pacific Design Automation Conference","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115699576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distribution arithmetic for stochastical analysis","authors":"M. Olbrich, E. Barke","doi":"10.1109/ASPDAC.2008.4484009","DOIUrl":"https://doi.org/10.1109/ASPDAC.2008.4484009","url":null,"abstract":"This paper presents a novel arithmetic which allows calculations with fluctuating values. The arithmetic consists of a special representation of random variables and procedures for performing numerical operations between them. Given the distributions of initial random variables, the moments (such as expected value, variance and higher moments) of any calculated variable can be determined. Our approach is not limited to normal distributions and works with linear and nonlinear functions. Correlations between variables are taken into account automatically by the arithmetic. Examples show the accuracy and runtimes compared to Monte Carlo simulation.","PeriodicalId":277556,"journal":{"name":"2008 Asia and South Pacific Design Automation Conference","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115921226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Load scheduling: Reducing pressure on distributed register files for free","authors":"M. Wen, N. Wu, Maolin Guan, Chunyuan Zhang","doi":"10.1109/ASPDAC.2008.4483971","DOIUrl":"https://doi.org/10.1109/ASPDAC.2008.4483971","url":null,"abstract":"In this paper we describe load scheduling, a novel method that balances load among register files by residual resources. Load scheduling can reduce register pressure for clustered VLIW processors with distributed register files while not increasing VLIW scheduling length. We have implemented load scheduling in compiler for Imagine and FT64 stream processors. The result shows that the proposed technique effectively reduces the number of variables spilled to memory, and can even eliminate it. The algorithm presented in this paper is extremely efficient in embedded processor with limited register resource because it can improve registers utilization instead of increasing the requirement for the number of registers.","PeriodicalId":277556,"journal":{"name":"2008 Asia and South Pacific Design Automation Conference","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114956840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design rule optimization of regular layout for leakage reduction in nanoscale design","authors":"Anupama R. Subramaniam, Ritu Singhal, Chi-Chao Wang, Yu Cao","doi":"10.1109/ASPDAC.2008.4483997","DOIUrl":"https://doi.org/10.1109/ASPDAC.2008.4483997","url":null,"abstract":"The effect of non-rectilinear gate (NRG) due to sub-wavelength lithography dramatically increases the leakage current by more than 15X. To mitigate this penalty, we have developed a systematic procedure to optimize key layout parameters in regular layout with minimum area and speed overhead. As demonstrated in 65 nm technology, the optimization of regular layout achieves more than 70% reduction in leakage under NRG, with area penalty of ~10% and marginal impact on circuit speed and active power.","PeriodicalId":277556,"journal":{"name":"2008 Asia and South Pacific Design Automation Conference","volume":"95 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124090417","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Collaborative hardware/software partition of coarse-grained reconfigurable system using evolutionary ant colony optimization","authors":"Dawei Wang, Sikun Li, Y. Dou","doi":"10.1109/ASPDAC.2008.4484037","DOIUrl":"https://doi.org/10.1109/ASPDAC.2008.4484037","url":null,"abstract":"The flexibility, performance and cost effectiveness of reconfigurable architectures have led to its widespread use for embedded applications. Coarse-grained reconfigurable system design is very complex for multi-fields experts to collaborate on application algorithm design, hardware/software co-design and system decision. However, existing reconfigurable system design methods and environments can only support hardware/software co-design, ignoring the collaboration between multi-field experts. This paper presents a collaborative partition approach of coarse-grained reconfigurable system design using evolutionary ant colony optimization. We create a distributed collaborative design environment for system decision engineers, software designers, hardware designers and application algorithm developers. The method not only utilizes the advantages of ant colony optimization for searching global optimal solutions, but also provides a framework for multi-field experts to work collaboratively. Experimental results show that the method improves the quality and speed of hardware/software partition for coarse-grained reconfigurable system design.","PeriodicalId":277556,"journal":{"name":"2008 Asia and South Pacific Design Automation Conference","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125934034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A design-for-diagnosis technique for diagnosing both scan chain faults and combinational circuit faults","authors":"Fei Wang, Yu Hu, Huawei Li, Xiaowei Li","doi":"10.1109/ASPDAC.2008.4484017","DOIUrl":"https://doi.org/10.1109/ASPDAC.2008.4484017","url":null,"abstract":"The amount of die area consumed by scan chains and scan control circuit can range from 15%~30%, and scan chain failures account for almost 50% of chip failures. As the conventional diagnosis process usually runs on the fault-free scan chain, scan chain faults may disable the diagnostic process, leaving large failure area to time-consuming failure analysis. In this paper, a design-for-diagnosis (DFD) technique is proposed to diagnose faulty scan chains precisely and efficiently. Moreover, with the assistance of the proposed technique, the conventional logic diagnostic process can be carried on with faulty scan chains. The proposed approach is entirely compatible with conventional scan-based design. Previously proposed software-based diagnostic methods for conventional scan designs can still be applied to our design. Experiments on ISCAS'89 benchmark circuits are conducted to demonstrate the efficiency of the proposed DFD technique.","PeriodicalId":277556,"journal":{"name":"2008 Asia and South Pacific Design Automation Conference","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127531375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analytical model for the impact of multiple input switching noise on timing","authors":"Rajeshwary Tayade, S. Nassif, J. Abraham","doi":"10.1109/ASPDAC.2008.4484005","DOIUrl":"https://doi.org/10.1109/ASPDAC.2008.4484005","url":null,"abstract":"The timing models used in current Static Timing Analysis tools use gate delays only for single input switching events. It is well known that the temporal proximity of signals arriving at different inputs causes significant variation in the gate delay. This variation in delay affects the accuracy of our timing estimates. In this paper, we derive simple analytical models for incorporating the effect of simultaneous multiple input switching events on gate delay. The model presented requires minimum additional characterization effort, and can be employed in a statistical timing engine. The dynamic delay variability of a path caused by MIS noise can be accurately estimated using the proposed model.","PeriodicalId":277556,"journal":{"name":"2008 Asia and South Pacific Design Automation Conference","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127331682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}