{"title":"Message from the 2019 ICRC Program Co-Chairs","authors":"","doi":"10.1109/icrc.2019.8914714","DOIUrl":"https://doi.org/10.1109/icrc.2019.8914714","url":null,"abstract":"","PeriodicalId":297574,"journal":{"name":"2019 IEEE International Conference on Rebooting Computing (ICRC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122304072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the Limits of Stochastic Computing","authors":"Florian Neugebauer, I. Polian, J. Hayes","doi":"10.1109/ICRC.2019.8914706","DOIUrl":"https://doi.org/10.1109/ICRC.2019.8914706","url":null,"abstract":"Stochastic computing (SC) provides large benefits in area and power consumption at the cost of limited computational accuracy. For this reason, it has been proposed for computation-intensive applications that can tolerate approximate results, such as neural networks (NNs) and digital filters. Most system implementations employing SC are referred to as stochastic circuits, even though they can have vastly different properties and limitations. In this work, we propose a distinction between strongly and weakly stochastic circuits, which provide different options and trade-offs for implementing SC operations. On this basis, we investigate some fundamental theoretical and practical limits of SC that have not been considered before. In particular, we analyze the limits of stochastic addition and show via the example of a convolutional NN that these limits can restrict the viability of strongly stochastic systems. We further show that, in theory, no non-affine function has an exact SC implementation, and we investigate the practical implications of this discovery.","PeriodicalId":297574,"journal":{"name":"2019 IEEE International Conference on Rebooting Computing (ICRC)","volume":"11 5","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120928429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrated Photonics Architectures for Residue Number System Computations","authors":"Jiaxin Peng, Y. Alkabani, Shuai Sun, V. Sorger, T. El-Ghazawi","doi":"10.1109/ICRC.2019.8914700","DOIUrl":"https://doi.org/10.1109/ICRC.2019.8914700","url":null,"abstract":"Residue number system (RNS) can represent large numbers as sets of relatively smaller prime numbers. Architectures for such systems can be inherently parallel, as arithmetic operations on large numbers can then be performed on elements of those sets individually. As RNS arithmetic is based on modulo operations, an RNS computational unit is usually constructed as a network of switches that are controlled to perform a specific computation, giving rise to the processing in network (PIN) paradigm. In this work, we explore using integrated photonics switches to build different high-speed architectures of RNS computational units based on multistage interconnection networks. The inherent parallelism of RNS, as well as the very low energy of integrated photonics, are two primary reasons for the promise of this direction. We study the trade-offs between the area and the control complexity of five different architectures. We show that our newly proposed architecture, which is based on arbitrary size Benes (AS-Benes) networks, saves up to 90% of the area and is up to 16 times faster than the other architectures.","PeriodicalId":297574,"journal":{"name":"2019 IEEE International Conference on Rebooting Computing (ICRC)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114624138","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Entangled State Preparation for Non-Binary Quantum Computing","authors":"Kaitlin N. Smith, M. Thornton","doi":"10.1109/ICRC.2019.8914717","DOIUrl":"https://doi.org/10.1109/ICRC.2019.8914717","url":null,"abstract":"A common model of quantum computing is the gate model with binary basis states. Here, we consider the gate model of quantum computing with a non-binary radix resulting in more than two basis states to represent a quantum digit, or qudit. Quantum entanglement is an important phenomenon that is a critical component of quantum computation and communications algorithms. The generation and use of entanglement among radix-2 qubits is well-known and used often in quantum computing algorithms. Quantum entanglement exists in higher-radix systems as well although little is written regarding the generation of higher-radix entangled states. We provide background describing the feasibility of multiple-valued logic quantum systems and describe a new systematic method for generating maximally entangled states in quantum systems of dimension greater than two. This method is implemented in a synthesis algorithm that is described. Experimental results are included that demonstrate the transformations needed to create specific forms of maximally entangled quantum states.","PeriodicalId":297574,"journal":{"name":"2019 IEEE International Conference on Rebooting Computing (ICRC)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114512708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Non-Volatile Memory Array Based Quantization- and Noise-Resilient LSTM Neural Networks","authors":"Wen Ma, P. Chiu, Won Ho Choi, Minghai Qin, D. Bedau, Martin Lueker-Boden","doi":"10.1109/ICRC.2019.8914713","DOIUrl":"https://doi.org/10.1109/ICRC.2019.8914713","url":null,"abstract":"In cloud and edge computing models, it is important that compute devices at the edge be as power efficient as possible. Long short-term memory (LSTM) neural networks have been widely used for natural language processing, time series prediction and many other sequential data tasks. Thus, for these applications there is increasing need for low-power accelerators for LSTM model inference at the edge. In order to reduce power dissipation due to data transfers within inference devices, there has been significant interest in accelerating vector-matrix multiplication (VMM) operations using non-volatile memory (NVM) weight arrays. In NVM array-based hardware, reduced bit-widths also significantly increase the power efficiency. In this paper, we focus on the application of a quantization-aware training algorithm to LSTM models, and the benefits these models bring in terms of resilience against both quantization error and analog device noise. We have shown that only 4-bit NVM weights and 4-bit ADC/DACs are needed to produce LSTM network performance equivalent to the floating-point baseline. Reasonable levels of ADC quantization noise and weight noise can be naturally tolerated within our NVM-based quantized LSTM network. Benchmark analysis of our proposed LSTM accelerator for inference has shown at least 2.4× better computing efficiency and 40× higher area efficiency than traditional digital approaches (GPU, FPGA, and ASIC). Some other novel approaches based on NVM promise to deliver higher computing efficiency (up to 4.7×) but require larger arrays with potentially higher error rates.","PeriodicalId":297574,"journal":{"name":"2019 IEEE International Conference on Rebooting Computing (ICRC)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116990145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Future Computing Systems (FCS) to Support \"Understanding\" Capability","authors":"R. Beausoleil, T. Vaerenbergh, Kirk M. Bresniker, Catherine E. Graves, Kimberly Keeton, Suhas Kumar, Can Li, D. Milojicic, S. Serebryakov, J. Strachan","doi":"10.1109/ICRC.2019.8914712","DOIUrl":"https://doi.org/10.1109/ICRC.2019.8914712","url":null,"abstract":"The massive explosion in data acquisition, processing, and archiving, accelerated by the end of Moore's Law, creates a challenge and an opportunity for a complete redesign of technology, devices, hardware architecture, software stack and AI stack to enable future computing systems with \"understanding\" capability. We propose a Future Computing System (FCS) based on a memory driven computing AI architecture, that leverages different types of next generation accelerators (e.g., Ising and Hopfield Machines), connected over an intelligent successor of the Gen-Z interconnect. On top of this architecture we propose a software stack and subsequently, an AI stack built on top of the software stack. While intelligence characteristics (learning, training, self-awareness, etc.) permeate all layers, we also separate AI-specific components into a separate layer for clear design. There are two aspects of AI in FCSs: a) AI embedded in the system to make the system better: better performing, more robust, self-healing, maintainable, repairable, and energy efficient. b) AI as the level of reasoning over the information contained within the system: the supervised and unsupervised techniques finding relationships over the data placed into the system. Developing the software and AI stack will require adapting to each redundant component. At least initially, specialization will be required. For this reason, starting with an interoperable, memory driven computing architecture and associated interconnect is essential for subsequent generalization. Our architecture is composable, i.e., it could be pursued in: a) its entirety, b) per-layer c) per component inside of the layer (e.g., only one of the accelerators, use cases, etc.); or d) exploring specific characteristics across the layers.","PeriodicalId":297574,"journal":{"name":"2019 IEEE International Conference on Rebooting Computing (ICRC)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127584222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Experimental Insights from the Rogues Gallery","authors":"Jeffrey S. Young, E. J. Riedy, T. Conte, Vivek Sarkar, Prasanth Chatarasi, S. Srikanth","doi":"10.1109/ICRC.2019.8914707","DOIUrl":"https://doi.org/10.1109/ICRC.2019.8914707","url":null,"abstract":"The Rogues Gallery is a new deployment for understanding next-generation hardware with a focus on unorthodox and uncommon technologies. This testbed project was initiated in 2017 in response to Rebooting Computing efforts and initiatives. The Gallery's focus is to acquire new and unique hardware (the rogues) from vendors, research labs, and start-ups and to make this hardware widely available to students, faculty, and industry collaborators within a managed data center environment. By exposing students and researchers to this set of unique hardware, we hope to foster cross-cutting discussions about hardware designs that will drive future performance improvements in computing long after the Moore's Law era of cheap transistors ends. We have defined an initial vision of the infrastructure and driving engineering challenges for such a testbed in a separate document, so here we present highlights of the first one to two years of post-Moore era research with the Rogues Gallery and give an indication of where we see future growth for this testbed and related efforts.","PeriodicalId":297574,"journal":{"name":"2019 IEEE International Conference on Rebooting Computing (ICRC)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114914007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ICRC 2019 Technical Program","authors":"Sachin S. Bhat, Csaba Andras Moritz, M. Inuiguchi, Hirosato Seki, S. Serebryakov, D. Milojicic, Natalia Vassilieva, S. Fleischman, D. Bedau, Craig Warner, C. Brueggen, Charles Williams, Nathaniel Jansen, J. Strachan, Amit Sharma","doi":"10.1109/icrc.2019.8914695","DOIUrl":"https://doi.org/10.1109/icrc.2019.8914695","url":null,"abstract":"Session 1 Machine Learning Systems Reconfigurable Probabilistic AI Architecture for Personalized Cancer Treatment Sourabh Kulkarni, Sachin Bhat, Csaba Andras Moritz (University of Massachusetts Amherst) On a Learning Method of the SIC Fuzzy Inference Model with Consequent Fuzzy Sets Genki Ohashi, Masahiro Inuiguchi, Hirosato Seki (Osaka University) Deep Learning Cookbook: Recipes for your AI Infrastructure and Applications Sergey Serebryakov, Dejan Milojicic, Natalia Vassilieva, Stephen Fleischman, Robert Clark (Cerebras, Hewlett Packard Enterprise)","PeriodicalId":297574,"journal":{"name":"2019 IEEE International Conference on Rebooting Computing (ICRC)","volume":"110 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127097916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design of a 16-Bit Adiabatic Microprocessor","authors":"Rene Celis-Cordova, A. Orlov, Tian Lu, J. Kulick, G. Snider","doi":"10.1109/ICRC.2019.8914699","DOIUrl":"https://doi.org/10.1109/ICRC.2019.8914699","url":null,"abstract":"Heat production is one of the main limiting factors in modern computing. In this paper, we explore adiabatic reversible logic which can dramatically reduce energy dissipation and is a viable implementation of future energy-efficient computing. We present a 16-bit adiabatic microprocessor with a multicycle MIPS architecture designed in 90nm technology. The adiabatic circuits are implemented using split-rail charge recovery logic, which allows the same circuit to be operated both in adiabatic mode and in standard CMOS mode. Simulations of a shift register show that energy dissipation can be much lower when operating in adiabatic mode compared to its CMOS counterpart. We present a standard cell library with all the necessary components to build adiabatic circuits and implement the subsystems of the microprocessor. The microprocessor has a proposed operating frequency of 0.5 GHz representing a useful implementation of adiabatic reversible computing.","PeriodicalId":297574,"journal":{"name":"2019 IEEE International Conference on Rebooting Computing (ICRC)","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123460287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}