Seitaro Kawai, R. Minami, Ahmed Musa, Takahiro Sato, Ning Li, Tatsuya Yamaguchi, Y. Takeuchi, Yuki Tsukui, K. Okada, A. Matsuzawa
{"title":"A full 4-channel 60 GHz direct-conversion transceiver","authors":"Seitaro Kawai, R. Minami, Ahmed Musa, Takahiro Sato, Ning Li, Tatsuya Yamaguchi, Y. Takeuchi, Yuki Tsukui, K. Okada, A. Matsuzawa","doi":"10.1109/ASPDAC.2013.6509573","DOIUrl":"https://doi.org/10.1109/ASPDAC.2013.6509573","url":null,"abstract":"This paper presents a 60-GHz direct-conversion transceiver in 65 nm CMOS technology. Using the proposed gain-peaking technique, the transceiver achieves good gain flatness and is capable of more than 7 Gbps in 16QAM wireless communication for all channels of the IEEE 802.11ad standard, with an EVM of around -23 dB. The transceiver consumes 319 mW in transmitting and 223 mW in receiving, including the PLL consumption.","PeriodicalId":297528,"journal":{"name":"2013 18th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117316791","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mohammad H. Foroozannejad, Brent Bohnenstiehl, S. Ghiasi
{"title":"BAMSE: A balanced mapping space exploration algorithm for GALS-based manycore platforms","authors":"Mohammad H. Foroozannejad, Brent Bohnenstiehl, S. Ghiasi","doi":"10.1109/ASPDAC.2013.6509642","DOIUrl":"https://doi.org/10.1109/ASPDAC.2013.6509642","url":null,"abstract":"We study the problem of mapping concurrent tasks of an application modeled as a data flow graph onto processors of a GALS-based manycore platform. We propose a mapping algorithm called BAMSE, which exploits the characteristics of streaming applications and the specifications of the target architecture to optimize the mapping solution. Different configuration parameters embedded into the algorithm enable one to strike a balance between scalability of the approach and the quality of generated solutions. Experiments with several real-life applications show that our algorithm outperforms hand-optimized manual mappings by up to 65% in terms of the longest inter-processor communication link, and by as much as 19% with respect to the total length of the links, when the two criteria are used as primary and secondary optimization objectives, respectively. Additionally, our algorithm delivers superior mappings compared to ILP-generated solutions after 10 days of solver runtime.","PeriodicalId":297528,"journal":{"name":"2013 18th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124941310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
S. Okumura, S. Yoshimoto, H. Kawaguchi, M. Yoshimoto
{"title":"A physical unclonable function chip exploiting load transistors' variation in SRAM bitcells","authors":"S. Okumura, S. Yoshimoto, H. Kawaguchi, M. Yoshimoto","doi":"10.1109/ASPDAC.2013.6509565","DOIUrl":"https://doi.org/10.1109/ASPDAC.2013.6509565","url":null,"abstract":"We propose a chip identification (ID) generating scheme that uses random variation of transistor characteristics in SRAM bitcells. In the proposed scheme, a unique fingerprint is generated by grounding both bitlines. The scheme is fast and can be implemented with a very small area overhead. We fabricated test chips in a 65-nm process and obtained 12,288 sets of unique 128-bit fingerprints, which are evaluated in this paper. The failure rate of the IDs is found to be 2.1 × 10^-12.","PeriodicalId":297528,"journal":{"name":"2013 18th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"98 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123161491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Symmetrical buffered clock-tree synthesis with supply-voltage alignment","authors":"Xin-Wei Shih, Tzu-Hsuan Hsu, Hsu-Chieh Lee, Yao-Wen Chang, Kai-Yuan Chao","doi":"10.1109/ASPDAC.2013.6509637","DOIUrl":"https://doi.org/10.1109/ASPDAC.2013.6509637","url":null,"abstract":"For high-performance synchronous systems, non-uniform/non-ideal supply voltages of buffers (e.g., due to IR-drop) may incur a large clock skew and thus serious performance degradation. This paper addresses this problem and presents the first symmetrical buffered clock-tree synthesis flow that considers supply voltage differences of buffers. We employ a two-phase technique of bottom-up clock sink clustering to determine the tree topology, followed by top-down buffer placement and wire routing to complete the clock tree. At each level of processing, clock skew and wirelength are minimized by the determination of buffer embedding regions and the alignment of buffer supply voltages. Experimental results show that our method achieves, on average, clock skew reductions of 76% and 40%, respectively, compared to the state-of-the-art work (1) without supply voltage consideration and (2) with an extension for supply voltages based on our top-down flow. The reduction is achieved with marginal resource and runtime overheads. Note that our method can meet the stringent skew constraint set by the 2010 ISPD contest for all cases, while other counterparts cannot. In particular, our work provides a key insight into the importance of handling practical design issues (such as IR-drop) for real-world clock-tree synthesis.","PeriodicalId":297528,"journal":{"name":"2013 18th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129456706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dependable VLSI Platform using Robust Fabrics","authors":"H. Onodera","doi":"10.1109/ASPDAC.2013.6509583","DOIUrl":"https://doi.org/10.1109/ASPDAC.2013.6509583","url":null,"abstract":"Technology scaling and growing complexity have an increasing impact on the resilience of VLSI circuits and systems. Severe challenges have been emerging for the realization of dependable VLSI circuits and systems with a necessary and sufficient amount of reliability and security. To cope with the increasing threats to manufacturability, variability, and transient (soft) errors, we have been working on the development of “Dependable VLSI Platform using Robust Fabrics.” The project tackles the challenges with collaborative research on layout, circuit, architecture, and design automation. An overview of the project as well as key achievements on the component level (Fabrics) and the architecture level (reconfigurable architecture) will be explained, followed by a brief introduction of the platform SoC and its C-based design tools.","PeriodicalId":297528,"journal":{"name":"2013 18th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129901488","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pulsed-latch ASIC synthesis in industrial design flow","authors":"Sangmin Kim, Duckhwan Kim, Youngsoo Shin","doi":"10.1109/ASPDAC.2013.6509621","DOIUrl":"https://doi.org/10.1109/ASPDAC.2013.6509621","url":null,"abstract":"The flip-flop has long been used as the sequencing element of choice in ASIC design; commercial synthesis tools have also been developed in this context. This work has been motivated by the question of whether existing CAD tools can be employed from RTL to layout when the pulsed latch replaces the flip-flop as a sequencing element. Two important problems have been identified and their solutions are proposed: placement of pulse generators and latches for integrity of pulse shape, and design of special scan latches and their selective use to reduce hold violations. A reference design flow has also been set up using published documents, in order to assess the proposed one. In 40-nm technology, the proposed flow achieves a 20% reduction in circuit area and a 30% reduction in power consumption, on average over 12 test circuits.","PeriodicalId":297528,"journal":{"name":"2013 18th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127751088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimizing routability in large-scale mixed-size placement","authors":"J. Cong, Guojie Luo, Kalliopi Tsota, Bingjun Xiao","doi":"10.1109/ASPDAC.2013.6509636","DOIUrl":"https://doi.org/10.1109/ASPDAC.2013.6509636","url":null,"abstract":"One of the necessary requirements for the placement process is that it should be capable of generating routable solutions. This paper describes a simple but effective method leading to the reduction of the routing congestion and the final routed wirelength for large-scale mixed-size designs. In order to reduce routing congestion and improve routability, we propose blocking narrow regions on the chip. We also propose dummy-cell insertion inside regions characterized by reduced fixed-macro density. Our placer consists of three major components: (i) narrow channel reduction by performing neighbor-based fixed-macro inflation; (ii) dummy-cell insertion inside large regions with reduced fixed-macro density; and (iii) pre-placement inflation by detecting tangled logic structures in the netlist and minimizing the maximum pin density. We evaluated the quality of our placer using the newly released DAC 2012 routability-driven placement contest designs and we compared our results to the top four teams that participated in the placement contest. The experimental results reveal that our placer improves the routability of the DAC 2012 placement contest designs and effectively reduces the routing congestion.","PeriodicalId":297528,"journal":{"name":"2013 18th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"130 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116578379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On real-time STM concurrency control for embedded software with improved schedulability","authors":"Mohammed El-Shambakey, B. Ravindran","doi":"10.1109/ASPDAC.2013.6509557","DOIUrl":"https://doi.org/10.1109/ASPDAC.2013.6509557","url":null,"abstract":"We consider software transactional memory (STM) concurrency control for embedded multicore real-time software, and present a novel contention manager for resolving transactional conflicts, called PNF. We upper bound transactional retries and task response times. Our implementation in RSTM/real-time Linux reveals that PNF yields retry costs that are shorter than or comparable to those of competitors.","PeriodicalId":297528,"journal":{"name":"2013 18th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121972218","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Network flow modeling for escape routing on staggered pin arrays","authors":"Pei-Ci Wu, Martin D. F. Wong","doi":"10.1109/ASPDAC.2013.6509595","DOIUrl":"https://doi.org/10.1109/ASPDAC.2013.6509595","url":null,"abstract":"Staggered pin arrays have recently been introduced for modern designs with high pin density. Although some studies have been done on escape routing for hexagonal arrays, the hexagonal array is only a special kind of staggered pin array. There exist other kinds of staggered pin arrays in current industrial designs, and the existing works cannot be extended to solve them. In this paper, we study the escape routing problem on staggered pin arrays. Network flow models are proposed to correctly model the capacity constraints of staggered pin arrays. Our models are guaranteed to find an escape routing satisfying the capacity constraints if one exists. The correctness of these models leads to an optimal algorithm.","PeriodicalId":297528,"journal":{"name":"2013 18th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116009631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Shared cache aware task mapping for WCRT minimization","authors":"Huping Ding, Yun Liang, T. Mitra","doi":"10.1109/ASPDAC.2013.6509688","DOIUrl":"https://doi.org/10.1109/ASPDAC.2013.6509688","url":null,"abstract":"The Worst-Case Response Time (WCRT) of multi-tasking applications running on multi-cores is an important metric for real-time embedded systems. The WCRT is determined by the mapping of the tasks to the cores (which determines load balancing) and the Worst-Case Execution Time (WCET) of the tasks. However, the WCET of a task is also influenced by the conflicts in the shared cache from concurrently executing tasks on other cores in a multi-core system. In other words, the mapping of the tasks to the cores indirectly influences the WCET of the tasks, which in turn impacts the WCRT of the entire application. Thus the mapping of the tasks to the cores should simultaneously maximize workload balance and minimize shared cache interference. We propose an integer-linear programming (ILP) formulation to achieve this objective. Experimental evaluation shows that shared cache aware task mapping achieves on average 25% and 33% WCRT reduction for real-life and synthetic applications, respectively, compared to a traditional approach that is agnostic to shared cache conflicts and solely focuses on load balancing.","PeriodicalId":297528,"journal":{"name":"2013 18th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126251042","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}