Youhei Umeki, Koji Yanagida, S. Yoshimoto, S. Izumi, M. Yoshimoto, H. Kawaguchi, K. Tsunoda, T. Sugii
{"title":"A Counter-based Read Circuit Tolerant to Process Variation for 0.4-V Operating STT-MRAM","authors":"Youhei Umeki, Koji Yanagida, S. Yoshimoto, S. Izumi, M. Yoshimoto, H. Kawaguchi, K. Tsunoda, T. Sugii","doi":"10.2197/ipsjtsldm.9.79","DOIUrl":"https://doi.org/10.2197/ipsjtsldm.9.79","url":null,"abstract":"The capacity of embedded memory on LSIs has kept increasing. It is important to reduce the leakage power of embedded memory for low-power LSIs. In fact, the ITRS predicts that the leakage power in embedded memory will account for 40% of all power consumption by 2024 [1]. A spin transfer torque magneto-resistance random access memory (STT-MRAM) is promising for use as non-volatile memory to reduce the leakage power. It is useful because it can function at low voltages and has a lifetime of over 1016 write cycles [2]. In addition, the STT-MRAM technology has a smaller bit cell than an SRAM. Making the STT-MRAM is suitable for use in high-density products [3–7]. The STT-MRAM uses magnetic tunnel junction (MTJ). The MTJ has two states: a parallel state and an anti-parallel state. These states mean that the magnetization direction of the MTJ’s layers are the same or different. The directions pair determines the MTJ’s magneto- resistance value. The states of MTJ can be changed by the current flowing. The MTJ resistance becomes low in the parallel state and high in the anti-parallel state. The MTJ potentially operates at less than 0.4 V [8]. In other hands, it is difficult to design peripheral circuitry for an STT-MRAM array at such a low voltage. In this paper, we propose a counter-based read circuit that functions at 0.4 V, which is tolerant of process variation and temperature fluctuation.","PeriodicalId":38964,"journal":{"name":"IPSJ Transactions on System LSI Design Methodology","volume":"45 1","pages":"79-83"},"PeriodicalIF":0.0,"publicationDate":"2016-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83938763","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
N. Sutisna, Reina Hongyo, L. Lanante, Y. Nagao, M. Kurosaki, H. Ochi
{"title":"Unified HW/SW Co-Verification Methodology for High Throughput Wireless Communication System","authors":"N. Sutisna, Reina Hongyo, L. Lanante, Y. Nagao, M. Kurosaki, H. Ochi","doi":"10.2197/ipsjtsldm.9.61","DOIUrl":"https://doi.org/10.2197/ipsjtsldm.9.61","url":null,"abstract":"","PeriodicalId":38964,"journal":{"name":"IPSJ Transactions on System LSI Design Methodology","volume":"24 1","pages":"61-71"},"PeriodicalIF":0.0,"publicationDate":"2016-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84597438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Placement: From Wirelength to Detailed Routability","authors":"Wing-Kai Chow, Evangeline F. Y. Young","doi":"10.2197/ipsjtsldm.9.2","DOIUrl":"https://doi.org/10.2197/ipsjtsldm.9.2","url":null,"abstract":"Research on the placement problem in physical design has evolved timely in the recent few decades from traditional wirelength-driven, to routability-driven and then to detailed-routability driven. In this paper, we will focus on the interconnect and routing issues in placement, and study and survey on the development and progress of related works on this important problem.","PeriodicalId":38964,"journal":{"name":"IPSJ Transactions on System LSI Design Methodology","volume":"23 1","pages":"2-12"},"PeriodicalIF":0.0,"publicationDate":"2016-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89667345","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detecting Arithmetic Optimization Opportunities for C Compilers by Randomly Generated Equivalent Programs","authors":"Atsushi Hashimoto, N. Ishiura","doi":"10.2197/ipsjtsldm.9.21","DOIUrl":"https://doi.org/10.2197/ipsjtsldm.9.21","url":null,"abstract":"This paper presents new methods of detecting missed arithmetic optimization opportunities for C compilers by random testing. For each iteration of random testing, two equivalent programs are generated, where the arithmetic expressions in the second program are more optimized in the C program level. By comparing the two assembly codes compiled from the two C programs, lack of optimization on either of the programs is detected. This method is further extended for detecting erroneous or insufficient optimization involving volatile variables. Two random programs differing only on the initial values for volatile variables are generated, and the resulting assembly codes are compared. Random test systems implemented based on the proposed methods have detected missed optimization opportunities on several compilers, including the latest development versions of GCC-5.0.0 and LLVM/Clang-3.6.","PeriodicalId":38964,"journal":{"name":"IPSJ Transactions on System LSI Design Methodology","volume":"24 1","pages":"21-29"},"PeriodicalIF":0.0,"publicationDate":"2016-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73904077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Calibration Technique for DVMC with Delay Time Controllable Inverter","authors":"Ri Cui, K. Namba","doi":"10.2197/ipsjtsldm.9.30","DOIUrl":"https://doi.org/10.2197/ipsjtsldm.9.30","url":null,"abstract":": This paper presents a novel calibration method for Delay Value Measurement Circuit (DVMC), a class of embedded time to digital converter (TDC), using a variable clock generator for accurate delay measurement. The proposed method uses a design for calibration as well as a variable clock generator. The design utilizes a delay time controllable (DTC) inverter. It also uses two OR-NAND gates which work as selectors; we reconfigure the construction of the ring oscillator (RO) in DVMC when calibrating the DTC inverter. The proposed scheme accomplishes more accurate calibration compared to the traditional calibration which only uses the variable clock generator. For example, when using a variable clock generator with the resolution of 5.2ps, the resolution of the proposed method is 0.58ps while the traditional method is 5.2ps.","PeriodicalId":38964,"journal":{"name":"IPSJ Transactions on System LSI Design Methodology","volume":"112 1","pages":"30-36"},"PeriodicalIF":0.0,"publicationDate":"2016-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86232255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Size Optimization Technique for Logic Circuits that Considers BTI and Process Variations","authors":"M. Yabuuchi, Kazutoshi Kobayashi","doi":"10.2197/ipsjtsldm.9.72","DOIUrl":"https://doi.org/10.2197/ipsjtsldm.9.72","url":null,"abstract":"In this paper we outline a transistor size optimization technique for logic circuits that takes into account BTI (Bias Temperature Instability) and process variations. We demonstrate the accuracy of our results with statistical analysis. Since variations have a large impact on the scaling process, dependable circuit designs should include a quantitative analysis if they are to become more reliable in the future. In this study we used an algorithm to prove that with our technique we efficiently lowered the timing margin of the logic path by 4.4% below the margin achieved by conventional techniques. We also observed that the lifetime of the optimized circuits extended without any area overhead.","PeriodicalId":38964,"journal":{"name":"IPSJ Transactions on System LSI Design Methodology","volume":"22 1","pages":"72-78"},"PeriodicalIF":0.0,"publicationDate":"2016-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83298940","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Fast Trace Aware Statistical Based Prediction Model with Burst Traffic Modeling for Contention Stall in A Priority Based MPSoC Bus","authors":"F. Shafiq, T. Isshiki, Dongju Li, H. Kunieda","doi":"10.2197/ipsjtsldm.9.37","DOIUrl":"https://doi.org/10.2197/ipsjtsldm.9.37","url":null,"abstract":": While Multiprocessor System-On-Chips (MPSoCs) are becoming widely adopted in embedded systems, communication architecture analysis for MPSoCs becomes ever more complex. There is a growing need for faster and accurate performance estimation techniques for on-chip bus architecture. This paper presents a novel fast statis- tical based bus stall prediction model that enables estimating the e ff ects of bus-contention stall on the cycle-count of an application program on a subject MPSoC architecture. Our technique fills the gap in existing techniques for bus performance estimation, that are either not accurate enough (e.g. static techniques) or too slow to be used in iterative analysis (e.g. cycle by cycle arbitration simulation on every bus access). First we formulate a model named “single blocking model” that models blocking of a single bus request due to a single bus transfer on another bus master at a time. Furthermore we augment this model with a “burst blocking model” that models bus stall incurred due to burst bus transfers. Together these two models give us a very fast way to predict bus stalls on an MPSoC bus. It is as-sumed that each Processor in the system has a distinct fixed priority, and arbitration is based on priority. The proposed technique makes use of accumulated “workload statistics” to accurately predict the “stall cycle counts” caused due to bus contention. This eliminates the need to simulate arbitration on every bus access, resulting in substantial speed-up. Proposed technique is verified by experiments on applications such as “synthetic tra ffi c generators”, “Newton-Euler dynamic control calculation for the 6-degrees-of-freedom Stanford manipulator benchmark”, “Random sparse matrix solver for electronic circuit simulations benchmark”, “Fast Fourier Transform with 1024 inputs of complex numbers” and “SPEC95 Fpppp which is a chemical program performing multi-electron integral derivatives”. Experimental re- sults show that the proposed method delivers a speed-up factor of 1.33, 1.7, 74 and 6 against the simulation method for the four benchmark applications respectively, while average estimation error is 7% for benchmark application, “Fast Fourier Transform with 1024 inputs of complex numbers” and under 1% for other benchmarks.","PeriodicalId":38964,"journal":{"name":"IPSJ Transactions on System LSI Design Methodology","volume":"46 1","pages":"37-48"},"PeriodicalIF":0.0,"publicationDate":"2016-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87811144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Amagasaki, Qian Zhao, M. Iida, M. Kuga, T. Sueyoshi
{"title":"A 3D FPGA Architecture to Realize Simple Die Stacking","authors":"M. Amagasaki, Qian Zhao, M. Iida, M. Kuga, T. Sueyoshi","doi":"10.2197/ipsjtsldm.8.116","DOIUrl":"https://doi.org/10.2197/ipsjtsldm.8.116","url":null,"abstract":"To balance between cost and performance, and to explore 3D field-programmable gate array (FPGA) with realistic 3D integration processes, we propose spatially distributed and functionally distributed types of 3D FPGA architectures. The functionally distributed architecture consists of two wafers, a logic layer and a routing layer, and is stacked by a face-down process technology. Since vertical wires pass through microbumps, no TSVs are needed. In contrast, the spatially distributed architecture is divided into multiple layers with the same structure, unlike in the functionally distributed type. This architecture can be expanded to more than two layers by stacking multiples of the same die. The goal of this paper is to elucidate the advantages and disadvantages of these two types of 3D FPGAs. According to our evaluation, when only two layers are used, the functionally distributed architecture is more effective. When higher performance is achieved by using more than two layers, the spatially distributed architecture achieves better performance.","PeriodicalId":38964,"journal":{"name":"IPSJ Transactions on System LSI Design Methodology","volume":"44 1","pages":"116-122"},"PeriodicalIF":0.0,"publicationDate":"2015-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79379723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High-level Synthesis for Low-power Design","authors":"Zhiru Zhang, Deming Chen, Steve Dai, K. Campbell","doi":"10.2197/ipsjtsldm.8.12","DOIUrl":"https://doi.org/10.2197/ipsjtsldm.8.12","url":null,"abstract":"Power and energy efficiency have emerged as first-order design constraints across the computing spectrum from handheld devices to warehouse-sized datacenters. As the number of transistors continues to scale, effectively managing design complexity under stringent power constraints has become an imminent challenge of the IC industry. The manual process of power optimization in RTL design has been increasingly difficult, if not already unsustainable. Complexity scaling dictates that this process must be automated with robust analysis and synthesis algorithms at a higher level of abstraction. Along this line, high-level synthesis (HLS) is a promising technology to improve design productivity and enable new opportunities for power optimization for higher design quality. By allowing early access to the system architecture, high-level decisions during HLS can have a significant impact on the power and energy efficiency of the synthesized design. In this paper, we will discuss the recent research development of using HLS to effectively explore a multi-dimensional design space and derive low-power implementations. We provide an in-depth coverage of HLS low-power optimization techniques and synthesis algorithms proposed in the last decade. We will also describe the key power optimization challenges facing HLS today and outline potential opportunities in tackling these challenges.","PeriodicalId":38964,"journal":{"name":"IPSJ Transactions on System LSI Design Methodology","volume":"422 1","pages":"12-25"},"PeriodicalIF":0.0,"publicationDate":"2015-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76628856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}