T. Miyakawa, Hiroshi Tsutsui, H. Ochi, Takashi Sato
{"title":"Realization of frequency-domain circuit analysis through random walk","authors":"T. Miyakawa, Hiroshi Tsutsui, H. Ochi, Takashi Sato","doi":"10.1109/ASPDAC.2013.6509591","DOIUrl":"https://doi.org/10.1109/ASPDAC.2013.6509591","url":null,"abstract":"This paper presents the realization of frequency-domain circuit analysis based on random walk framework for the first time. In conventional random walk based circuit analyses, the sample movement at a node is randomly chosen to follow the edge probabilities. The probabilities are determined by edge-admittances connecting to the node, which is impossible to apply for the frequency-domain analysis because the probabilities are imaginary numbers. By applying the idea of importance sampling, the intractable imaginary probabilities are converted into real numbers while maintaining the estimation correctness. Runtime acceleration through incremental analysis is also proposed.","PeriodicalId":297528,"journal":{"name":"2013 18th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128998312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High-density integration of functional modules using monolithic 3D-IC technology","authors":"Shreepad Panth, K. Samadi, Yang Du, S. Lim","doi":"10.1109/ASPDAC.2013.6509679","DOIUrl":"https://doi.org/10.1109/ASPDAC.2013.6509679","url":null,"abstract":"Three dimensional integrated circuits (3D-ICs) have emerged as a promising solution to continue device scaling. They can be realized using Through Silicon Vias (TSVs), or monolithic integration using Monolithic Inter-tier vias (MIVs), an emerging alternative that provides much higher via densities. In this paper, we provide a framework for floorplanning existing 2D IP blocks into 3D-ICs using MIVs. We take the floorplanning solution all the way through place-and-route and report post-layout metrics for area, wirelength, timing, and power consumption. Results show that the wirelength of TSV-based 3D designs outperform 2D designs by upto 14% in large-scale circuits only. MIV-based 3D designs, however, offer an average wirelength improvement of 33% for a wide range of benchmark circuits. We also show that while TSV-based 3D cannot improve the performance and power unless the TSV capacitance is reduced, MIV-based 3D offers significant reduction of upto 33% in the longest path delay and 35% in the inter-block net power.","PeriodicalId":297528,"journal":{"name":"2013 18th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129198351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cache Capacity Aware Thread Scheduling for Irregular Memory Access on many-core GPGPUs","authors":"Hsien-Kai Kuo, Ta-Kan Yen, B. Lai, Jing-Yang Jou","doi":"10.1109/ASPDAC.2013.6509618","DOIUrl":"https://doi.org/10.1109/ASPDAC.2013.6509618","url":null,"abstract":"On-chip shared cache is effective to alleviate the memory bottleneck in modern many-core systems, such as GPGPUs. However, when scheduling numerous concurrent threads on a GPGPU, a cache capacity agnostic scheduling scheme could lead to severe cache contention among threads and thus significant performance degradation. Moreover, the diverse working sets in irregular applications make the cache contention issue an even more serious problem. As a result, taking cache capacity into account has become a critical scheduling issue of GPGPUs. This paper formulates a Cache Capacity Aware Thread Scheduling Problem to capture the impact of cache capacity as well as different architectural considerations. With a proof to be NP-hard, this paper has proposed two algorithms to perform the cache capacity aware thread scheduling. The simulation results on Nvidia's Fermi configuration have shown that the proposed scheduling scheme can effectively avoid cache contention, and achieve an average of 44.7% cache miss reduction and 28.5% runtime enhancement. The paper also shows the runtime can be enhanced up to 62.5% for more complex applications.","PeriodicalId":297528,"journal":{"name":"2013 18th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126872834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Layer minimization in escape routing for staggered-pin-array PCBs","authors":"Yuan-Kai Ho, Xin-Wei Shih, Yao-Wen Chang, Chung-Kuan Cheng","doi":"10.1109/ASPDAC.2013.6509594","DOIUrl":"https://doi.org/10.1109/ASPDAC.2013.6509594","url":null,"abstract":"As the technology advances, the pin number of a high-end PCB design keeps increasing. The staggered pin array is used to accommodate a larger pin number than the grid pin array of the same area. Nevertheless, escaping a large pin number to the boundary of a dense staggered pin array, namely multilayer escape routing for staggered pin arrays, is significantly harder than that for grid pin arrays. This paper addresses this multilayer escape routing problem to minimize the number of used layers in a staggered pin array for manufacturing cost reduction. We first present an escaped pin selection method to assign a maximal number of escaped pins in the current layer and also to increase useful routing regions for subsequent layers. Missing pins are also modeled in our routing network to utilize the routing resource effectively. Experimental results show that our approach can significantly reduce the required layer number for escape routing.","PeriodicalId":297528,"journal":{"name":"2013 18th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121057340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On potential design impacts of electromigration awareness","authors":"A. Kahng, S. Nath, T. Simunic","doi":"10.1109/ASPDAC.2013.6509650","DOIUrl":"https://doi.org/10.1109/ASPDAC.2013.6509650","url":null,"abstract":"Reliability issues significantly limit performance improvements from Moore's-Law scaling. At 45nm and below, electromigration (EM) is a serious reliability issue which affects global and local interconnects in a chip and limits performance scaling. Traditional IC implementation flows meet a 10-year lifetime requirement by overdesigning and sacrificing performance. At the same time, it is well-known among circuit designers that Black's Equation [2] suggests that lifetime can be traded for performance. In our work, we carefully study the impacts of EM-awareness on IC implementation outcomes, and show that circuit performance does not trade off so smoothly with mean time to failure (MTTF) as suggested by Black's Equation. We conduct two basic studies: EM lifetime versus performance with fixed resource budget, and EM lifetime versus resource with fixed performance. Using design examples implemented in two process nodes, we show that performance scaling achieved by reducing the EM lifetime requirement depends on the EM slack in the circuit, which in turn depends on factors such as timing constraints, length of critical paths and the mix of cell sizes. Depending on these factors, the performance gain can range from 10% to 80% when the lifetime requirement is reduced from 10 years to one year. We show that at a fixed performance requirement, power and area resources are affected by the timing slack and can either decrease by 3% or increase by 7.8% when the MTTF requirement is reduced. We also study how conventional EM fixes using per net Non-Default Rule (NDR) routing, downsizing of drivers, and fanout reduction affect performance at reduced lifetime requirements. Our study indicates, e.g., that NDR routing can increase performance by up to 5% but at the cost of 2% increase in area at a reduced 7-year lifetime requirement.","PeriodicalId":297528,"journal":{"name":"2013 18th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134053079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-mode pipelined MPSoCs for streaming applications","authors":"Haris Javaid, D. Witono, S. Parameswaran","doi":"10.1109/ASPDAC.2013.6509601","DOIUrl":"https://doi.org/10.1109/ASPDAC.2013.6509601","url":null,"abstract":"In this paper, we propose a design flow for the pipelined paradigm of Multi-Processor System on Chips (MPSoCs) targeting multiple streaming applications. A multi-mode pipelined MPSoC, used as a streaming accelerator, executes multiple, mutually exclusive applications through modes where each mode refers to the execution of one application. We model each application as a directed graph. The challenge is to merge application graphs into a single graph so that the multi-mode pipelined MPSoC derived from the merged graph contains minimal resources. We solve this problem by finding maximal overlap between application graphs. Three heuristics are proposed where two of them greedily merge application graphs while the third one finds an optimal merging at the cost of higher running time. The results indicate significant area saving (up to 62% processor area, 57% FIFO area and 44 processor/FIFO ports) with minuscule degradation of system throughput (up to 2%) and latency (up to 2%) and increase in energy values (up to 3%) when compared to widely used approach of designing distinct pipelined MPSoCs for individual applications. Our work is the first step in the direction of multi-mode pipelined MPSoCs, and the results demonstrate the usefulness of resource sharing among pipelined MPSoCs based streaming accelerators in a multimedia platform.","PeriodicalId":297528,"journal":{"name":"2013 18th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134527197","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kai-Han Tseng, Sheng-Chi You, W. H. Minhass, Tsung-Yi Ho, P. Pop
{"title":"A network-flow based valve-switching aware binding algorithm for flow-based microfluidic biochips","authors":"Kai-Han Tseng, Sheng-Chi You, W. H. Minhass, Tsung-Yi Ho, P. Pop","doi":"10.1109/ASPDAC.2013.6509598","DOIUrl":"https://doi.org/10.1109/ASPDAC.2013.6509598","url":null,"abstract":"Designs of flow-based microfluidic biochips are receiving much attention recently because they replace conventional biological automation paradigm and are able to integrate different biochemical analysis functions on a chip. However, as the design complexity increases, a flow-based microfluidic biochip needs more chip-integrated micro-valves, i.e., the basic unit of fluid-handling functionality, to manipulate the fluid flow for biochemical applications. Moreover, frequent switching of micro-valves results in decreased reliability. To minimize the valve-switching activities, we develop a network-flow based resource binding algorithm based on breadth-first search (BFS) and minimum cost maximum flow (MCMF) in architectural-level synthesis. The experimental results show that our methodology not only makes significant reduction of valve-switching activities but also diminishes the application completion time for both real-life applications and a set of synthetic benchmarks.","PeriodicalId":297528,"journal":{"name":"2013 18th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"95 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134545471","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Loadsa: A yield-driven top-down design method for STT-RAM array","authors":"Wujie Wen, Yaojun Zhang, Lu Zhang, Yiran Chen","doi":"10.1109/ASPDAC.2013.6509611","DOIUrl":"https://doi.org/10.1109/ASPDAC.2013.6509611","url":null,"abstract":"As an emerging nonvolatile memory technology, spin-transfer torque random access memory (STT-RAM) faces great design challenges. The large device variations and the thermal-induced switching randomness of the magnetic tunneling junction (MTJ) introduce the persistent and non-persistent errors in STT-RAM operations, respectively. Modeling these statistical metrics generally require the expensive Monte-Carlo simulations on the combined magnetic-CMOS models, which is hardly integrated in the modern micro-architecture and system designs. Also, the conventional bottom-up design method incurs costly iterations in the STT-RAM design toward specific system requirement. In this work, we propose Loadsa1: a yield-driven top-down design method to explore the design space of STT-RAM array from a statistical point of view. Both array-level semi-analytical yield model and cell-level failure-probability model are developed to enable a top-down design method: The system-level requirements, e.g., the chip yield under power and area constraints, are hierarchically mapped to array-and cell-level design parameters, e.g., redundancy, ECC scheme, and MOS transistor size, etc. Our simulation results show that Loadsa can accurately optimize the STT-RAM based on the system and cell-level constraints with a linear computation complexity. Our method demonstrates great potentials in the early design stage of memory or micro-architecture by eliminating the design integrations, while offering a full statistical view of the design even when the common yield enhancement practices are applied.","PeriodicalId":297528,"journal":{"name":"2013 18th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129383959","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal partition with block-level parallelization in C-to-RTL synthesis for streaming applications","authors":"Shuangchen Li, Yongpan Liu, X. Hu, Xinyu He, Yining Zhang, Pei Zhang, Huazhong Yang","doi":"10.1109/ASPDAC.2013.6509600","DOIUrl":"https://doi.org/10.1109/ASPDAC.2013.6509600","url":null,"abstract":"Developing FPGA solutions for streaming applications written in C (or its variants) can benefit greatly from automatic C-to-RTL (C2RTL) synthesis. Yet, the complexity and stringent throughput/cost constraints of such applications are rather challenging for existing C2RTL synthesis tools. This paper considers automatic partition and block-level parallelization to address these challenges. An MILP-based approach is introduced for finding an optimal partition of a given program into blocks while allowing block-level parallelization. In order to handle extremely large problem instances, a heuristic algorithm is also discussed. Experimental results based on seven well known multimedia applications demonstrate the effectiveness of both solutions.","PeriodicalId":297528,"journal":{"name":"2013 18th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116851251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ting-Shuo Chou, T. Givargis, Chen-Chun Huang, Bailey Miller, F. Vahid
{"title":"An efficient compression scheme for checkpointing of FPGA-based digital mockups","authors":"Ting-Shuo Chou, T. Givargis, Chen-Chun Huang, Bailey Miller, F. Vahid","doi":"10.1109/ASPDAC.2013.6509669","DOIUrl":"https://doi.org/10.1109/ASPDAC.2013.6509669","url":null,"abstract":"This paper outlines a transparent and nonintrusive checkpointing mechanism for use with FPGA-based digital mockups. A digital mockup is an executable model of a physical system and used for real-time test and validation of cyber-physical devices that interact with the physical system. These digital mockups are typically defined in terms of a large set of ordinary differential equations. We consider digital mockups impelemented on field-programmable gate arrays (FPGAs). A checkpoint is a snapshot of the internal state of the model at a specific point in time as captured by some controller that resides on the same FPGA. We require that the model continues uninterrupted execution during a checkpointing operation. Once a checkpoint is created, the corresponding state information is transferred from the FPGA to a host computer for visualization and other off-chip processing. We outline the architecture of a checkpointing controller that captures and transfers the state information at a desired clock cycle using an aggressive compression technique. Our compression technique achieves 90% reduction in data transferred from the FPGA to the host computer under periodic checkpointing scenarios. The checkpointing with compression yields 15-36% FPGA size overhead, versus 6-11% for checkpointing without compression.","PeriodicalId":297528,"journal":{"name":"2013 18th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114580027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}