{"title":"Scheduling with integer time budgeting for low-power optimization","authors":"Wei Jiang, Zhiru Zhang, M. Potkonjak, J. Cong","doi":"10.1109/ASPDAC.2008.4483947","DOIUrl":"https://doi.org/10.1109/ASPDAC.2008.4483947","url":null,"abstract":"In this paper we present a mathematical programming formulation of the integer time budgeting problem for directed acyclic graphs. In particular, we formally prove that our constraint matrix has a special property that enables a polynomial-time algorithm to solve the problem optimally with a guaranteed integral solution. Our theory can be directly applied to solving a scheduling problem in behavioral synthesis with the objective of minimizing the system power consumption. Given a set of scheduling constraints and a collection of convex power-delay tradeoff curves for each type of operation, our scheduler can intelligently schedule the operations to appropriate clock cycles and simultaneously select the module implementations that lead to low-power solutions. Experiments demonstrate that our proposed technique can produce near-optimal results (within 6% of the optimum by the ILP formulation), with 40x+ speedup.","PeriodicalId":277556,"journal":{"name":"2008 Asia and South Pacific Design Automation Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128577172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new low energy BIST using a statistical code","authors":"S. Chun, Taejin Kim, Sungho Kang","doi":"10.1109/ASPDAC.2008.4484031","DOIUrl":"https://doi.org/10.1109/ASPDAC.2008.4484031","url":null,"abstract":"To tackle with the increased switching activity during the test operation, this paper proposes a new built-in self test (BIST) scheme for low energy testing that uses a statistical code and a new technique to skip unnecessary test sequences. From a general point of view, the goal of this technique is to minimize the total power consumption during a test and to allow the at-speed test in order to achieve high fault coverage. The effectiveness of the proposed low energy BIST scheme was validated on a set of ISC AS '89 benchmark circuits with respect to test data volume and energy saving.","PeriodicalId":277556,"journal":{"name":"2008 Asia and South Pacific Design Automation Conference","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122978681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Super-K: A SoC for single-chip ultra mobile computer","authors":"Xu Cheng","doi":"10.1109/ASPDAC.2008.4483957","DOIUrl":"https://doi.org/10.1109/ASPDAC.2008.4483957","url":null,"abstract":"In this panel discussion, I will make a brief introduction to the on-going Super-K project at the Microprocessor Research and Development Center (MPRC). The background of this project is the convergence of computers, consumer electronics and communication, so called 3C. MPRC has a ten-year history in microprocessor and system-on-chip (SoC) research and design. In the past, we have developed our own 32-bit RISC processor named UniCore, and two generations of SoC named PKUnity-863 (in acknowledgement of the support from China National High-Tech Program 863, abbr. 863 Project). From 2003 onwards, network computers based on UniCore have been used in commercial context.","PeriodicalId":277556,"journal":{"name":"2008 Asia and South Pacific Design Automation Conference","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124161944","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
L. Carloni, A. Kahng, S. Muddu, A. Pinto, K. Samadi, P. Sharma
{"title":"Interconnect modeling for improved system-level design optimization","authors":"L. Carloni, A. Kahng, S. Muddu, A. Pinto, K. Samadi, P. Sharma","doi":"10.1109/ASPDAC.2008.4483952","DOIUrl":"https://doi.org/10.1109/ASPDAC.2008.4483952","url":null,"abstract":"Accurate modeling of delay, power, and area of interconnections early in the design phase is crucial for effective system-level optimization. Models presently used in system-level optimizations, such as network-on-chip (NoC) synthesis, are inaccurate in the presence of deep-submicron effects. In this paper, we propose new, highly accurate models for delay and power in buffered interconnects; these models are usable by system-level designers for existing and future technologies. We present a general and transferable methodology to construct our models from a wide variety of reliable sources (Liberty, LEF/ITF, ITRS, PTM, etc.). The modeling infrastructure, and a number of characterized technologies, are available as open source. Our models comprehend key interconnect circuit and layout design styles, and a power-efficient buffering technique that overcomes unrealities of previous delay-driven buffering techniques. We show that our models are significantly more accurate than previous models for global and intermediate buffered interconnects in 90 nm and 65 nm foundry processes - essentially matching signoff analyses. We also integrate our models in the COSI-OCC synthesis tool and show that the more accurate modeling significantly affects optimal/achievable architectures that are synthesized by the tool. The increased accuracy provided by our models enables system-level designers to obtain better assessments of the achievable performance/power/area tradeoffs for (communication-centric aspects of) system design, with negligible setup and overhead burdens.","PeriodicalId":277556,"journal":{"name":"2008 Asia and South Pacific Design Automation Conference","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127057266","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient synthesis of compressor trees on FPGAs","authors":"H. Parandeh-Afshar, P. Brisk, P. Ienne","doi":"10.1109/ASPDAC.2008.4483927","DOIUrl":"https://doi.org/10.1109/ASPDAC.2008.4483927","url":null,"abstract":"FPGA performance is currently lacking for arithmetic circuits. Large sums of k > 2 integer values is a computationally intensive operation in applications such as digital signal and video processing. In ASIC design, compressor trees, such as Wallace and Dadda trees, are used for parallel accumulation; however, the LUT structure and fast carry-chains employed by modern FPGAs favor trees of carry-propagate adders (CPAs), which are a poor choice for ASIC design. This paper presents the first method to successfully synthesize compressor trees on LUT-based FPGAs. In particular, we have found that generalized parallel counters (GPCs) map quite well to LUTs on FPGAs; a heuristic, presented within, constructs a compressor tree from a library of GPCs that can efficiently be implemented on the target FPGA. Compared to the ternary adder trees produced by commercial synthesis tools, our heuristic reduces the combinational delay by 27.5%, on average, within a tolerable average area increase of 5.7%.","PeriodicalId":277556,"journal":{"name":"2008 Asia and South Pacific Design Automation Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131266357","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reaching the limits of low power design","authors":"J. Hobbs, T. Williams","doi":"10.1109/ASPDAC.2008.4484048","DOIUrl":"https://doi.org/10.1109/ASPDAC.2008.4484048","url":null,"abstract":"As process technologies continue to shrink, and feature demands continue to increase, more and more capabilities are being pushed into smaller and smaller packages. But are we finally reaching the point where power density limitations make this trend no longer sustainable? What advanced techniques are in use today, and on the horizon, to address this? Are we limited only to hardware techniques, or can these power limitation issues be addressed with smarter software development? And how do we handle verification of these complex implementations? This paper explores possible methods for improving the \";power capacity\"; of power sensitive designs.","PeriodicalId":277556,"journal":{"name":"2008 Asia and South Pacific Design Automation Conference","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132860475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Block remap with turnoff: A variation-tolerant cache design technique","authors":"M. A. Hussain, M. Mutyam","doi":"10.1109/ASPDAC.2008.4484058","DOIUrl":"https://doi.org/10.1109/ASPDAC.2008.4484058","url":null,"abstract":"With reducing feature size, the effects of process variations are becoming more and more predominant. Memory components such as on-chip caches are more susceptible to such variations because of high density and small sized transistors present in them. Process variations can result in high access latency and leakage energy dissipation. This may lead to a functionally correct chip being rejected, resulting in reduced chip yield. In this paper, by considering a process variation affected on-chip data cache, we first analyze performance loss due to worst-case design techniques such as accessing the entire cache with the worst-case access latency or turning off the process variation affected cache blocks, and show that the worst-case design techniques result in significant performance loss and/or high leakage energy. Then by exploiting the fact that not all applications require full associativity at set-level, we propose a variation-tolerant design technique, namely, block remap with turnoff (BRT), to minimize performance loss and leakage energy consumption. In BRT technique we selectively turnoff few blocks after rearranging them in such a way that all sets get almost equal number of process variation affected blocks. By turning off process variation affected blocks of a set, leakage energy can be minimized and the set can be accessed with low latency at the cost of reduced set associativity. We validate our technique by running SPEC2000 CPU benchmark-suite on Simplescalar simulator and show that our technique significantly reduces the performance loss and leakage energy consumption due to process variations.","PeriodicalId":277556,"journal":{"name":"2008 Asia and South Pacific Design Automation Conference","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124458503","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Routability driven modification method of monotonic via assignment for 2-layer Ball Grid Array packages","authors":"Yoichi Tomioka, A. Takahashi","doi":"10.1109/ASPDAC.2008.4483949","DOIUrl":"https://doi.org/10.1109/ASPDAC.2008.4483949","url":null,"abstract":"Ball grid array packages in which I/O pins are arranged in a grid array pattern realize a number of connections between chips and a printed circuit board, but it takes much time in manual routing. We propose a fast routing method for 2-layer ball grid array packages to support designers. Our method distributes wires evenly on top layer and increases completion ratio of nets by improving via assignment iteratively.","PeriodicalId":277556,"journal":{"name":"2008 Asia and South Pacific Design Automation Conference","volume":"135 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116415356","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A brand new wireless day","authors":"J. Rabaey","doi":"10.1109/ASPDAC.2008.4483940","DOIUrl":"https://doi.org/10.1109/ASPDAC.2008.4483940","url":null,"abstract":"Summary form only given. The wireless communications field has experienced a truly amazing growth since the early 1990's. Wireless connectivity slowly but surely has become pervasive. One would expect that by now this revolution must be losing some steam, but the truth is far from that. If anything, it is gathering even more speed. In the coming decades, introduction of innovative wireless technologies will enable a broad range of exciting applications to come to fruition, and reshape the way we interact with our daily living environment. Underlying it all is a three-tiered environment consisting of a large number of huge data and compute centers, billions of mobile compute and computation devices, and potentially trillions of tiny sensors and actuators. Making this happen will require some important wireless roadblocks to be either overcome or circumvented. A short list of those includes spectrum scarcity, reliability, complexity, security and obviously power. In this presentation, a number of innovative and even revolutionary solutions to address these will be discussed. Examples are collaborative cognitive networks, wireless in the mm-wave region of the spectrum, and miniature wireless. Each of these approaches pushes some part of the design technology to its limits, and may even require a totally novel approach towards design, all this while semiconductor technology is trying to cope with the uncertainty of design in the nanometer regime. One thing is for sure - the wireless designer of the next decade is bound for some very exciting times.","PeriodicalId":277556,"journal":{"name":"2008 Asia and South Pacific Design Automation Conference","volume":"565 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116449324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parameterized embedded in-circuit emulator and its retargetable debugging software for microprocessor/microcontroller/DSP processor","authors":"Liang-Bi Chen, Yung-Chih Liu, Chen-Hung Chen, Chung-Fu Kao, Ing-Jer Huang","doi":"10.1109/ASPDAC.2008.4483923","DOIUrl":"https://doi.org/10.1109/ASPDAC.2008.4483923","url":null,"abstract":"The in-circuit emulator (ICE) is commonly adopted as a microprocessor debugging technique. In this paper, a parameterized embedded in-circuit emulator and its retargetable debugging software are proposed. The parameterized embedded in-circuit emulator can be integrated into different style processors such as microcontroller, microprocessor, and DSP processor. The GUI interface debugging software can help user to debug easily. As a result of it, the duration of microprocessor debugging design procedure time is reduced.","PeriodicalId":277556,"journal":{"name":"2008 Asia and South Pacific Design Automation Conference","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122981727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}