Title: Towards Logic-In-Memory circuits using 3D-integrated Nanomagnetic logic
Authors: F. Riente, G. Ziemys, G. Turvani, D. Schmitt-Landsiedel, S. B. Gamm, M. Graziano
DOI: 10.1109/ICRC.2016.7738700
Published in: 2016 IEEE International Conference on Rebooting Computing (ICRC), October 2016
Abstract: Perpendicular nanomagnetic logic (pNML) is an emerging beyond-CMOS technology listed in the ITRS roadmap for next-generation computing, owing to its non-volatility, monolithic 3D integration, small-scale feature sizes, and low power consumption. Here, we demonstrate the feasibility of a monolithic 3D pNML circuit that integrates both memory and logic on the same device, on different layers, by exploiting the novel Logic-In-Memory (LIM) concept. LIM is realized by placing magnetic memory elements (registers) in a memory layer located monolithically just below the logic plane and interconnected to it by purely magnetic vias. In particular, the nonvolatile magnetization state of bistable, nanoscale magnets with perpendicular magnetic anisotropy is exploited to build a magnetic D flip-flop. This basic memory element is then used to build a more compact and power-efficient N-bit parallel-in, parallel-out register. Indeed, the presented magnetic flip-flop implementation is two orders of magnitude more compact than its 32 nm CMOS counterpart. The approach is studied through the implementation of an accumulator (adder plus memory) as a case study, and the occupied area of an N-bit accumulator is compared against the 45 nm and 28 nm CMOS technology nodes. This concept enables information to be stored locally on the computing chip, saving area and exploiting the strengths of pNML for next-generation, memory-intensive computing tasks.
Title: Opportunities in physical computing driven by analog realization
Authors: J. Hasler
DOI: 10.1109/ICRC.2016.7738680
Published in: 2016 IEEE International Conference on Rebooting Computing (ICRC), October 2016
Abstract: In the past, discussions of the capabilities of analog or physical computing were only of theoretical interest. Digital computation's 80-year history runs from Turing's original model of computation to today's ubiquitous computational devices, whereas the modern development of analog computation started with almost no computational framework. Today, we have significant programmable and configurable physical computing systems. The focus of this paper is to revisit these discussions given the very real potential of ultra-low-power physical computing systems. This work considers the current state of analog computation, energy-efficient computation, and analog numerical analysis, moving toward a unified analog-computing framework that includes quantum computing as part of physical computing.
Title: Energy efficiency limits of logic and memory
Authors: S. Agarwal, Jeanine E. Cook, E. Debenedictis, M. Frank, G. Cauwenberghs, S. Srikanth, Bobin Deng, Eric R. Hein, Paul G. Rabbat, T. Conte
DOI: 10.1109/ICRC.2016.7738676
Published in: 2016 IEEE International Conference on Rebooting Computing (ICRC), October 2016
Abstract: We address practical limits of energy-efficiency scaling for logic and memory. Scaling of logic will end with unreliable operation, making computers probabilistic as a side effect. The errors can be corrected or tolerated, but the overhead grows with further scaling. We address the tradeoff between scaling and error correction that yields the minimum energy per operation, finding new error-correction methods with energy-consumption limits about 2× below current approaches. The maximum energy efficiency of memory depends on several other factors. Adiabatic and reversible methods applied to logic are promising, but overheads have precluded practical use. However, the regular structure of memory arrays tends to reduce overhead and makes adiabatic memory a viable option. This paper reports an adiabatic memory that has been tested at about an 85× improvement in energy efficiency over standard designs. Combining these approaches could set energy-efficiency expectations for processor-in-memory computing systems.
Title: A functional architecture for scalable quantum computing
Authors: E. Sete, W. Zeng, C. Rigetti
DOI: 10.1109/ICRC.2016.7738703
Published in: 2016 IEEE International Conference on Rebooting Computing (ICRC), October 2016
Abstract: Quantum computing devices based on superconducting quantum circuits have developed rapidly in the last few years. The building blocks (superconducting qubits, quantum-limited amplifiers, and two-qubit gates) have been demonstrated by several groups. Small prototype quantum processor systems have been implemented with performance adequate to demonstrate quantum chemistry simulations and optimization algorithms, and to enable experimental tests of quantum error-correction schemes. A major bottleneck in the effort to develop larger systems is the need for a scalable functional architecture that combines all the core building blocks in a single, scalable technology. We describe such a functional architecture, based on a planar lattice of transmon and fluxonium qubits, parametric amplifiers, and a novel fast, DC-controlled two-qubit gate.
Title: Approximate computing: Challenges and opportunities
Authors: A. Agrawal, Jungwook Choi, K. Gopalakrishnan, Suyog Gupta, R. Nair, Jinwook Oh, D. Prener, Sunil Shukla, V. Srinivasan, Zehra Sura
DOI: 10.1109/ICRC.2016.7738674
Published in: 2016 IEEE International Conference on Rebooting Computing (ICRC), October 2016
Abstract: Approximate computing is gaining traction as a computing paradigm for data analytics and cognitive applications that aim to extract deep insight from vast quantities of data. In this paper, we demonstrate that multiple approximation techniques can be applied to applications in these domains and can be combined to compound their benefits. In assessing the potential of approximation in these applications, we took the liberty of changing multiple layers of the system stack: architecture, programming model, and algorithms. Across a set of applications spanning the domains of DSP, robotics, and machine learning, we show that hot loops in the applications can be perforated by an average of 50% with a proportional reduction in execution time, while still producing acceptable quality of results. In addition, the width of the data used in the computation can be reduced from the currently common 32/64 bits to 10-16 bits, with potential for significant performance and energy benefits. For parallel applications, we reduced execution time by 50% using relaxed synchronization mechanisms. Finally, our results demonstrate that benefits compound when these techniques are applied concurrently. Across different applications, approximate computing proves to be a widely applicable paradigm with potential for compounded benefits from applying multiple techniques across the system stack. To exploit these benefits, it is essential to rethink multiple layers of the system stack to embrace approximation from the ground up and to design tightly integrated approximate accelerators. Doing so will enable applications to move into a world in which the architecture, programming model, and even the algorithms used to implement the application are all fundamentally designed for approximate computing.
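Two of the techniques surveyed in this abstract, loop perforation and bit-width reduction, are easy to illustrate. The sketch below is not the authors' implementation; it is a minimal, hypothetical example of perforating a reduction loop and quantizing its operands:

```python
def perforated_mean(samples, perforation=0.5):
    """Loop perforation: skip a fraction of iterations and average
    only the samples that were actually visited."""
    stride = max(1, round(1 / (1 - perforation)))  # 0.5 -> every 2nd sample
    kept = samples[::stride]
    return sum(kept) / len(kept)

def quantize(x, bits=12):
    """Bit-width reduction: round a value in [-1, 1] onto a signed
    fixed-point grid with the given number of bits."""
    scale = 2 ** (bits - 1) - 1
    return round(x * scale) / scale
```

With 50% perforation the loop does half the work, and for smooth inputs the result drifts only slightly (the mean of 0..99 becomes 49.0 instead of 49.5); that drift is the "acceptable quality of results" tradeoff the paper quantifies across whole applications.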
Title: Spiking network algorithms for scientific computing
Authors: William M. Severa, Ojas D. Parekh, Kristofor D. Carlson, C. James, J. Aimone
DOI: 10.1109/ICRC.2016.7738681
Published in: 2016 IEEE International Conference on Rebooting Computing (ICRC), October 2016
Abstract: For decades, neural networks have shown promise for next-generation computing, and recent breakthroughs in machine learning techniques, such as deep neural networks, have provided state-of-the-art solutions for inference problems. However, these networks require thousands of training processes and are poorly suited for the precise computations required in scientific and similar arenas. The emergence of dedicated spiking neuromorphic hardware creates a powerful computational paradigm that can be leveraged for these exact scientific or otherwise objective computing tasks. We forgo any learning process and instead construct the network graph by hand. In turn, the networks produce guaranteed results, often with easily computable complexity. We demonstrate a number of algorithms exemplifying concepts central to spiking networks, including spike timing and synaptic delay. We also discuss the application of cross-correlation for particle image velocimetry and provide two spiking algorithms: one uses time-division multiplexing, and the other runs in constant time.
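The role of spike timing and synaptic delay in such hand-constructed networks can be illustrated with a toy example (ours, not one of the paper's algorithms): encode each input value as a transmission delay and let the first spike to reach a readout neuron decide the answer. Here a race of delayed spikes computes a minimum:

```python
import heapq

def spiking_min(values):
    """Toy spike-timing network: input neuron i fires a spike that
    reaches the readout after a delay equal to values[i].  The readout
    reports the source of the first spike to arrive, i.e. the minimum."""
    spikes = [(delay, neuron) for neuron, delay in enumerate(values)]
    heapq.heapify(spikes)                    # event queue ordered by arrival time
    arrival, winner = heapq.heappop(spikes)  # first spike wins the race
    return values[winner]
```

The answer is guaranteed by construction, with no training, which is the design point the paper argues for: correctness comes from the hand-built network graph, not from learned weights.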
Title: Hyperdimensional biosignal processing: A case study for EMG-based hand gesture recognition
Authors: Abbas Rahimi, S. Benatti, P. Kanerva, L. Benini, J. Rabaey
DOI: 10.1109/ICRC.2016.7738683
Published in: 2016 IEEE International Conference on Rebooting Computing (ICRC), October 2016
Abstract: The mathematical properties of high-dimensional spaces seem remarkably well suited for describing behaviors produced by brains. Brain-inspired hyperdimensional computing (HDC) explores the emulation of cognition by computing with hypervectors as an alternative to computing with numbers. Hypervectors are high-dimensional, holographic, and (pseudo)random, with independent and identically distributed (i.i.d.) components. These features provide an opportunity for energy-efficient computing applied to cyberbiological and cybernetic systems. We describe the use of HDC in a smart prosthetic application, namely hand gesture recognition from a stream of electromyography (EMG) signals. Our algorithm encodes a stream of analog EMG signals, generated simultaneously on four channels, into a single hypervector. The proposed encoding effectively captures spatial and temporal relations across and within the channels to represent a gesture. This HDC encoder achieves a high classification accuracy (97.8%) with only one third of the training data required by a state-of-the-art SVM on the same task. HDC exhibits fast and accurate learning that explicitly allows online and continuous learning. We further enhance the encoder to endogenously and adaptively mitigate the effect of gesture-timing uncertainties across different subjects; furthermore, the encoder inherently maintains the same accuracy with up to 30% overlap between two consecutive gestures in a classification window.
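The core HDC operations the abstract relies on (random bipolar hypervectors, binding by component-wise multiplication, bundling by summation) can be sketched as follows. The dimensionality and four-channel setup mirror typical HDC practice, but the code is an illustrative reconstruction, not the authors' encoder:

```python
import numpy as np

D = 10_000                      # hypervector dimensionality
rng = np.random.default_rng(0)

def random_hv():
    """Pseudo-random bipolar hypervector with i.i.d. +/-1 components."""
    return rng.choice([-1, 1], size=D)

# Item memory: one fixed random hypervector per EMG channel (4 channels).
CHANNELS = [random_hv() for _ in range(4)]

def encode_sample(levels, level_hvs):
    """Bind each channel's hypervector to the hypervector of its
    quantized signal level (component-wise product), then bundle the
    bound vectors into one (sum followed by sign)."""
    bound = [CHANNELS[c] * level_hvs[lvl] for c, lvl in enumerate(levels)]
    return np.sign(np.sum(bound, axis=0))

def similarity(a, b):
    """Normalized dot product: ~0 for unrelated hypervectors, 1 for equal."""
    return float(np.dot(a, b)) / D
```

Classification then reduces to comparing an encoded query against per-gesture prototype hypervectors with `similarity`: because random hypervectors are nearly orthogonal in high dimensions, a query is close only to the prototype it was bundled into.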
Title: Neuromorphic computing with integrated photonics and superconductors
Authors: J. Shainline, S. Buckley, R. Mirin, S. Nam
DOI: 10.1109/ICRC.2016.7738694
Published in: 2016 IEEE International Conference on Rebooting Computing (ICRC), October 2016
Abstract: We present a hardware platform combining integrated photonics with superconducting electronics for large-scale neuromorphic computing. Semiconducting few-photon light-emitting diodes work in conjunction with superconducting-nanowire single-photon detectors to behave as spiking neurons. These neurons are connected through a network of waveguides, and variable connection weights can be implemented using several approaches. The processing units can operate at 20 MHz with fully asynchronous activity, light-speed-limited latency, and power densities on the order of 1 mW/cm². They achieve an energy efficiency of 20 aJ per synapse event, an improvement of six orders of magnitude over recent CMOS demonstrations [1]. We present calculations showing that this approach could scale to interconnectivity near that of the human brain, and could surpass the brain in speed and efficiency.
Title: Technology considerations for neuromorphic computing
Authors: D. Mountain
DOI: 10.1109/ICRC.2016.7738688
Published in: 2016 IEEE International Conference on Rebooting Computing (ICRC), October 2016
Abstract: The use of neural nets has been growing rapidly, and a variety of computing architectures, such as CPUs, GPUs, FPGAs, and analog designs, have been proposed to support them. This paper explores how technology options affect design choices, using both digital and analog circuit designs suitable for neural nets.
Title: Computationally-redundant energy-efficient processing for y'all (CREEPY)
Authors: Bobin Deng, S. Srikanth, Eric R. Hein, Paul G. Rabbat, T. Conte, E. Debenedictis, Jeanine E. Cook
DOI: 10.1109/ICRC.2016.7738714
Published in: 2016 IEEE International Conference on Rebooting Computing (ICRC), October 2016
Abstract: Dennard scaling has ended. Lowering the supply voltage (Vdd) to sub-volt levels causes intermittent losses of signal integrity, so further scaling down is no longer acceptable as a means of lowering the power required by a processor core. However, if the occasional losses due to lower Vdd could be recovered efficiently, power could still be lowered effectively. In other words, by deploying the right amount and kind of redundancy, we can strike a balance between the overhead incurred in achieving reliability and the savings realized by permitting a lower Vdd. One promising approach is the Redundant Residue Number System (RRNS) representation. Unlike other error-correcting codes, RRNS has the important property of being closed under addition, subtraction, and multiplication, enabling correction of errors caused by both faulty storage and faulty compute units. Furthermore, the incorporated approach uses a fraction of the overhead of, and is more efficient than, the conventional techniques used for compute reliability. In this article, we provide an overview of the architecture of a CREEPY core that leverages this property of RRNS and discuss associated algorithms such as error detection/correction, arithmetic-overflow detection, and signed-number representation. Finally, we demonstrate the usability of such a computer by quantifying a performance-reliability tradeoff, and we provide a lower-bound measure of the tolerable input signal energy at a gate while still maintaining reliability.
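RRNS closure under addition, subtraction, and multiplication means errors can be detected and corrected directly on encoded operands. The sketch below uses a hypothetical five-modulus system (three data moduli plus two redundant ones) to show single-residue error correction by trial reconstruction; it illustrates the coding principle only, not the CREEPY core's actual algorithms:

```python
from math import prod

MODULI = (3, 5, 7, 11, 13)         # last two moduli are redundant
DATA = 3                           # number of non-redundant (data) moduli
LEGAL_RANGE = prod(MODULI[:DATA])  # legal values are 0..104

def encode(x):
    """Represent x by its residues modulo each modulus."""
    return tuple(x % m for m in MODULI)

def crt(residues, moduli):
    """Chinese Remainder Theorem reconstruction."""
    M = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)   # modular inverse (Python 3.8+)
    return x % M

def correct(residues):
    """Single-residue error correction: drop each residue in turn; the
    reconstruction that falls inside the legal range is the codeword."""
    for skip in range(len(MODULI)):
        mods = [m for i, m in enumerate(MODULI) if i != skip]
        kept = [r for i, r in enumerate(residues) if i != skip]
        x = crt(kept, mods)
        if x < LEGAL_RANGE:
            return x
    raise ValueError("uncorrectable: more than one residue is faulty")
```

For example, corrupting any single residue of `encode(42)` still decodes to 42, and arithmetic can be performed residue-wise on encoded values before decoding, which is what lets both storage and compute faults be caught by the same code.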