Honglan Jiang, Cong Liu, Naman Maheshwari, F. Lombardi, Jie Han
{"title":"A comparative evaluation of approximate multipliers","authors":"Honglan Jiang, Cong Liu, Naman Maheshwari, F. Lombardi, Jie Han","doi":"10.1145/2950067.2950068","DOIUrl":"https://doi.org/10.1145/2950067.2950068","url":null,"abstract":"A multiplier has a significant impact on the speed and power dissipation of an arithmetic processor. Precise results are not always required in many algorithms, such as those for classification and recognition in data processing. Moreover, many errors do not make an obvious difference in applications such as image processing due to the perceptual limitations of human beings. Error-tolerant algorithms and applications have promoted the development of approximate multipliers to tradeoff accuracy for speed, implementation area and/or power efficiency. This paper briefly reviews the current designs of approximate multipliers and provides a comparative evaluation of their error and circuit characteristics. Image sharpening is performed using the considered approximate multipliers to assess their performance in such applications.","PeriodicalId":213559,"journal":{"name":"2016 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132780243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accelerate context switch by racetrack-SRAM hybrid cells","authors":"Weiqi Zhang, Chao Zhang, Guangyu Sun","doi":"10.1145/2950067.2950070","DOIUrl":"https://doi.org/10.1145/2950067.2950070","url":null,"abstract":"Context switch is an essential feature of modern operating systems. The purpose of context switch is to provide concurrency processing of multiple programs. However, the backup and reload procedures due to context switch are time consuming. In this work, we propose a Racetrack-memory SRAM hybrid (RSH) cell design, which can be used to replace current SRAM cells in caches, to reduce the overhead of context switch. Using RSH reduces the overhead of context switch by 65.2% on average with 34% area overhead, compared with traditional SRAM designs.","PeriodicalId":213559,"journal":{"name":"2016 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125324332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of spin-Hall-assisted STT-MRAM for cache replacement","authors":"Liang Chang, Zhaohao Wang, Yuqian Gao, W. Kang, Youguang Zhang, Weisheng Zhao","doi":"10.1145/2950067.2950107","DOIUrl":"https://doi.org/10.1145/2950067.2950107","url":null,"abstract":"Emerging spin orbit torque (SOT) promises to achieve high-speed write operation for magnetoresistive random access memory (MRAM) since it can eliminate the incubation delay of the conventional spin transfer torque (STT). Such a speed improvement allows the MRAM to be used as low-level cache in the computer architecture. Among various SOT technologies, spin-Hall-assisted STT is a potential candidate thanks to its magnetic-field-free benefit. In this work, we evaluate the potential of the spin-Hall-assisted STT-MRAM in the cache replacement. Firstly, the bit-cell parameters are obtained from the circuit-level simulation. Then, the cache evaluation and system-level simulation are performed with NVSim and Gem5 simulators. Simulation results validate the advantage of the spin-Hall-assisted STT in the write speed and energy. Moreover, for the large capacity (about >2 MB), the spin-Hall-assisted STT-MRAM is a competitive candidate for replacing the conventional SRAM-based cache.","PeriodicalId":213559,"journal":{"name":"2016 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131584194","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiaoyang Wang, Chao Zhang, Xian Zhang, Guangyu Sun
{"title":"np-ECC: Nonadjacent position error correction code for racetrack memory","authors":"Xiaoyang Wang, Chao Zhang, Xian Zhang, Guangyu Sun","doi":"10.1145/2950067.2950082","DOIUrl":"https://doi.org/10.1145/2950067.2950082","url":null,"abstract":"Racetrack memory is a promising non-volatile memory because of its ultra-high storage density. The data are stored along the tape-like cell, where a “shift” operation is used to move the data in a cell back and forth to be accessed. Shift operations suffer from “position error”, where the shift distance is incorrect. Previous work solved the error by position error correction code (p-ECC). However, a bit error within the p-ECC bits will fail the correction mechanism. To protect p-ECC bits from bit errors, we propose a new mapping method for p-ECC, called nonadjacent position error correction code (np-ECC) in this paper. Evaluation shows significant reduction on correction mechanism failure rate.","PeriodicalId":213559,"journal":{"name":"2016 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117145758","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
P. Ouyang, S. Yin, Chunxiao Xing, Leibo Liu, Shaojun Wei
{"title":"Energy management on DVS based coarse-grained reconfigurable platform","authors":"P. Ouyang, S. Yin, Chunxiao Xing, Leibo Liu, Shaojun Wei","doi":"10.1145/2950067.2950097","DOIUrl":"https://doi.org/10.1145/2950067.2950097","url":null,"abstract":"The coarse-grained reconfigurable architecture (CGRA) is a promising platform for mobile computing. In this work, based on the battery nonlinear effects, we propose a method to achieve co-optimization of task partition and multi-cell battery scheduling with dynamical voltage scaling (DVS) for CGRA computing platform. Experimental results show that average 33.6% improvement in battery runtime over the current methods is achieved.","PeriodicalId":213559,"journal":{"name":"2016 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116284161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deming Zhang, L. Zeng, Youguang Zhang, Weisheng Zhao, Jacques-Olivier Klein
{"title":"Stochastic spintronic device based synapses and spiking neurons for neuromorphic computation","authors":"Deming Zhang, L. Zeng, Youguang Zhang, Weisheng Zhao, Jacques-Olivier Klein","doi":"10.1145/2950067.2950105","DOIUrl":"https://doi.org/10.1145/2950067.2950105","url":null,"abstract":"Spintronics devices such as magnetic tunnel junction (MTJ) have been investigated for the neuromorphic computation. However, there are still a number of challenges for hardware implementation of the bio-inspired computing, for instance how to use the binary MTJ to mimic the analog synapse. In this paper, a compound scheme is firstly proposed, which employs multiple MTJs connected in parallel operating in the stochastic regime to jointly behave a single synapse, aiming to achieve an analog-like weight spectrum. To further exploit its stochastic switching property for the bio-inspired computing, we present a MTJ based stochastic spiking neuron (SSN) circuit, which can also realize the neural rate coding scheme. A case study is made on the MNIST database for handwritten digital recognition with the proposed compound magnetoresistive synapse (CMS) and SSN. System-level simulation results show that the proposed CMS and SSN can implement neuromorphic computation with high accuracy and immunity to device variation.","PeriodicalId":213559,"journal":{"name":"2016 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122156743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nonvolatile online CMOS trimming with magnetic tunnel junctions","authors":"S. Dutta, Michael Price, M. Baldo","doi":"10.1145/2950067.2950091","DOIUrl":"https://doi.org/10.1145/2950067.2950091","url":null,"abstract":"We design a programmable output delay for digital circuits, making use of the nonvolatile bit storage in magnetic tunnel junctions (MTJs), which are available in hybrid processes with CMOS. We introduce our programmable clock buffers in VLSI dot product and fast Fourier transform (FFT) circuits at the 45 nm node. We reduce the FFT clock skew by 39%. These performance improvements and the ability to reduce timing violations come at less than a 1% increase in area and power.","PeriodicalId":213559,"journal":{"name":"2016 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132818563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bo Yang, E. Popovici, M. Quille, A. Amann, S. Cotofana
{"title":"A supply voltage-dependent variation aware reliability evaluation model","authors":"Bo Yang, E. Popovici, M. Quille, A. Amann, S. Cotofana","doi":"10.1145/2950067.2950089","DOIUrl":"https://doi.org/10.1145/2950067.2950089","url":null,"abstract":"With the continuous scaling of CMOS VLSI technology well into the nano-meter regime, and the increasing demand for ultra low power/low voltage circuits and systems, reliability is becoming an extra design optimisation goal in addition to size, performance, and energy. In this paper, a supply voltage (Vdd-) dependent, transistor threshold voltage variation aware propagation delay estimation model and a comprehensive statistical model to evaluate the reliability of the VLSI circuits is proposed. This accurate Vdd-dependent reliability evaluation model can be applied in the process of reliability driven multi-objective optimisation, which deals with tradeoffs between reliability, area, performance and energy. The experimental results show that the average estimation error is within 3% when compared to Monte-Carlo SPICE simulation while saving runtime by at least 100 times for generic benchmark circuits.","PeriodicalId":213559,"journal":{"name":"2016 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"2 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117337104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Eleonora Testa, Mathias Soeken, O. Zografos, L. Amarù, P. Raghavan, R. Lauwereins, P. Gaillardon, G. Micheli
{"title":"Inversion optimization in Majority-Inverter Graphs","authors":"Eleonora Testa, Mathias Soeken, O. Zografos, L. Amarù, P. Raghavan, R. Lauwereins, P. Gaillardon, G. Micheli","doi":"10.1145/2950067.2950072","DOIUrl":"https://doi.org/10.1145/2950067.2950072","url":null,"abstract":"Many emerging nanotechnologies realize majority gates as primitive building blocks and they benefit from a majority-based synthesis. Recently, Majority-Inverter Graphs (MIGs) have been introduced to abstract these new technologies. We present optimization techniques for MIGs that aim at rewriting the complemented edges of the graph without changing its shape. We demonstrate the performance of our optimization techniques by considering three cases of emerging technology design: semi-custom digital design using Spin Wave Devices (SWDs) and Quantum-Dot Cellular Automata (QCA); and logic in-memory operation within Resistive Random Access Memories (RRAMs). Our experimental results show that SWD and QCA technologies benefit from complemented edges minimization. Area, delay, and power of SWD-based circuits are improved by 13.8%, 21.1%, and 9.2% respectively, while the number of QCA cells in QCA-based circuits can be decreased by 4.9% on average. Reductions of 14.4% and 12.4% in the number of devices and sequential steps respectively can be achieved for RRAMs when the number of nodes with exactly one complemented input is increased during MIG optimization.","PeriodicalId":213559,"journal":{"name":"2016 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131693313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. Levisse, B. Giraud, J. Noel, M. Moreau, J. Portal
{"title":"Capacitor based SneakPath compensation circuit for transistor-less ReRAM architectures","authors":"A. Levisse, B. Giraud, J. Noel, M. Moreau, J. Portal","doi":"10.1145/2950067.2950073","DOIUrl":"https://doi.org/10.1145/2950067.2950073","url":null,"abstract":"With the arrival of crosspoint based memories on the consumer market, high-density resistive memories could be introduced as flash memories replacement or as storage class memory. However, transistor-Less Resistive memory architectures using 1Selector-1resistance bitcells suffer from performances loss due to sneaking current through unselected bitcells. Beyond the back end of line selector design, circuit design solutions have to be pushed in order to improve precision during programming steps. In this paper we propose a novel capacitor based 2-steps SneakPath compensation circuit for transistor-less architectures of resistive memories. Compared to standard SneakPath compensation circuits, it ensures up to 20× of area improvement and more than 3× reduction of the variability effects for a 28nm CMOS node.","PeriodicalId":213559,"journal":{"name":"2016 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"96 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130860986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}