{"title":"Reducing the Size of Spiking Convolutional Neural Networks by Trading Time for Space","authors":"J. Plank, Jiajia Zhao, Brent Hurst","doi":"10.1109/ICRC2020.2020.00010","DOIUrl":"https://doi.org/10.1109/ICRC2020.2020.00010","url":null,"abstract":"Spiking neural networks are attractive alternatives to conventional neural networks because of their ability to implement complex algorithms with low power and network complexity. On the flip side, they are difficult to train to solve specific problems. One approach to training is to train conventional neural networks with binary threshold activation functions, which may then be implemented with spikes. This is a powerful approach. However, when applied to neural networks with convolutional kernels, the spiking networks explode in size. In this work, we design multiple spiking computational modules, which reduce the size of the networks back to size of the conventional networks. They do so by taking advantage of the temporal nature of spiking neural networks. We evaluate the size reduction analytically and on classification examples. Finally, we compare and confirm the classification accuracy of their implementation on a discrete threshold neuroprocessor.","PeriodicalId":320580,"journal":{"name":"2020 International Conference on Rebooting Computing (ICRC)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121075176","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Classical Adiabatic Annealing in Memristor Hopfield Neural Networks for Combinatorial Optimization","authors":"Suhas Kumar, T. Vaerenbergh, J. Strachan","doi":"10.1109/ICRC2020.2020.00016","DOIUrl":"https://doi.org/10.1109/ICRC2020.2020.00016","url":null,"abstract":"There is an intense search for supplements to digital computer processors to solve computationally hard problems, such as gene sequencing. Quantum computing has gained popularity in this search, which exploits quantum tunneling to achieve adiabatic annealing. However, quantum annealing requires very low temperatures and precise control, which lead to unreasonably high costs. Here we show via simulations, alongside experimental instantiations, that computational advantages qualitatively similar to those gained by quantum annealing can be achieved at room temperature in classical systems by using a memristor Hopfield neural network to solve computationally hard problems.","PeriodicalId":320580,"journal":{"name":"2020 International Conference on Rebooting Computing (ICRC)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122048985","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adiabatic Circuits for Quantum Computer Control","authors":"E. Debenedictis","doi":"10.1109/ICRC2020.2020.00004","DOIUrl":"https://doi.org/10.1109/ICRC2020.2020.00004","url":null,"abstract":"How far can quantum computers scale up? Quantum computers have more qubits with longer lifetimes than ever before, yet experimentalists report a scaling limit around 1,000 qubits due to heat dissipation in the classical control system. This paper introduces new classical circuits and architectures that will reduce this dissipation by exploiting the heat difference between room temperature and the cryogenic environment. In lieu of using just cryo CMOS or Single Flux Quantum (SFQ) Josephson junctions (JJs), this paper focuses on cryogenic adiabatic transistor circuits (CATC), which use the same transistors as CMOS and are clocked on a ladder of clock rates, enabling the circuits to exploit varying energy-delay tradeoffs to increase energy efficiency. These design principles could lead to a scale up path for quantum computers that combines aspects of Moore’s law with the principles of quantum speedup.","PeriodicalId":320580,"journal":{"name":"2020 International Conference on Rebooting Computing (ICRC)","volume":"34 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129209523","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Alien vs. Predator: Brain Inspired Sparse Coding Optimization on Neuromorphic and Quantum Devices","authors":"Kyle Henke, Benjamin Migliori, Garrett T. Kenyon","doi":"10.1109/ICRC2020.2020.00015","DOIUrl":"https://doi.org/10.1109/ICRC2020.2020.00015","url":null,"abstract":"Machine Learning has achieved immense progress by exploiting CPUs and GPUs on classical computing hardware. However, the inevitable end of Moore’s Law on these devices requires the adaptation and exploration of novel computational platforms in order to continue these advancements. Biologically accurate, energy efficient neuromorphic systems and fully en-tangled quantum systems are particularly promising arenas for enabling future advances. In this work, we perform a detailed comparison on a level playing field between these two novel substrates by applying them to an identical challenge.We solve the sparse coding problem using the biologically inspired Locally Competitive Algorithm (LCA) on the D-Wave quantum annealer and Intel Loihi neuromorphic spiking processor. The Fashion-MNIST data set was chosen and dimensionally-reduced by sparse Principal Component Analysis (sPCA). A sign flipped second data set was created and appended to the original in order to give each class a mean zero distribution, effectively creating an environment where the data could not be linearly separated. An early in time normalization technique for Loihi is presented along with analysis of optimal parameter selection and unsupervised dictionary learning for all three variations. Studies are ongoing, but preliminary results suggest each computational substrate requires casting the NP-Hard optimization problem in a slightly different manner to best capture the individual strengths, and the new Loihi method allows for more realistic comparison between the two.","PeriodicalId":320580,"journal":{"name":"2020 International Conference on Rebooting Computing (ICRC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128483642","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Optical Accelerator for Deep Neural Network Based on Integrated Nanophotonics","authors":"Jun Shiomi, T. Ishihara, H. Onodera, A. Shinya, M. Notomi","doi":"10.1109/ICRC2020.2020.00017","DOIUrl":"https://doi.org/10.1109/ICRC2020.2020.00017","url":null,"abstract":"The emergence of nanophotonic devices has enabled to design light-speed on-chip optical circuits with extremely low latency. This paper proposes an optical implementation of scalable Deep Neural Networks (DNNs) enabling light-speed inference. The key issue in optical neural networks is the scalability limited by area, power and the number of available wavelengths. Due to the scalability, it is thus difficult to design an all-optical hardware accelerator for a large-scale DNN. To solve this problem, this paper firstly proposes an optical Vector Matrix Multiplier (VMM) structure operating with a low latency. The multipliers in a VMM are highly parallelized based on the Wavelength Division Multiplexing (WDM) technique, which reduces the area overhead without sacrificing the ultra-high speed nature. This paper then proposes the electrical digital interfaces for storing and handling intermediate VMM data without sacrificing the ultra-high speed nature, which enables to reuse the VMM multiple times with a low latency.","PeriodicalId":320580,"journal":{"name":"2020 International Conference on Rebooting Computing (ICRC)","volume":"113 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117306128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Why Reliability for Computing Needs Rethinking","authors":"Valeriu Beiu, V. Dragoi, Roxana-Mariana Beiu","doi":"10.1109/ICRC2020.2020.00006","DOIUrl":"https://doi.org/10.1109/ICRC2020.2020.00006","url":null,"abstract":"Offering high quality services/products has been of paramount importance for both communications and computations. Early on, both of these were in dire need of practical designs for enhancing reliability. That is why John von Neumann proposed the first gate-level method (using redundancy to build reliable systems from unreliable components), while Edward F. Moore and Claude E. Shannon followed suit with the first device-level scheme. Moore and Shannon’s prescient paper also established network reliability as a probabilistic model where the nodes of the network were considered to be perfectly reliable, while the edges could fail independently with a certain probability. The fundamental problem was that of estimating the probability that (under given conditions) two (or more) nodes are connected, the solution being represented by the well-known reliability polynomial (of the network). This concept has been heavily used for communications, where big strides were made and applied to networks of: roads, railways, power lines, fiber optics, phones, sensors, etc. For computations the research community converged on the gate-level method proposed by von Neumann, while the device-level scheme crafted by Moore and Shannon—although very practical and detailed—did not inspire circuit designers and went under the radar. That scheme was built on a thought-provoking network called hammock, exhibiting regular brick-wall near-neighbor connections. Trying to do justice to computing networks in general (and hammocks in particular), this paper aims to highlight and clarify how reliable different types of networks are when they are intended for performing computations. For doing this, we will define quite a few novel cost functions which, together with established ones, will allow us to meticulously compare different types of networks for a clearer understanding of the reliability enhancements they are able to bring to computations. To our knowledge, this is the first ever ranking of networks with respect to computing reliability. The main conclusion is that a rethinking/rebooting of how should we design reliable computing systems, immediately applicable to networks/arrays of devices (e.g., transistors or qubits), is both timely and needed.","PeriodicalId":320580,"journal":{"name":"2020 International Conference on Rebooting Computing (ICRC)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124550085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accelerating Simulation-based Inference with Emerging AI Hardware","authors":"S. Kulkarni, A. Tsyplikhin, M. M. Krell, C. A. Moritz","doi":"10.1109/ICRC2020.2020.00003","DOIUrl":"https://doi.org/10.1109/ICRC2020.2020.00003","url":null,"abstract":"Developing models of natural phenomena by capturing their underlying complex interactions is a core tenet of various scientific disciplines. These models are useful as simulators and can help in understanding the natural processes being studied. One key challenge in this pursuit has been to enable statistical inference over these models, which would allow these simulation-based models to learn from real-world observations. Recent efforts, such as Approximate Bayesian Computation (ABC), show promise in performing a new kind of inference to leverage these models. While the scope of applicability of these inference algorithms is limited by the capabilities of contemporary computational hardware, they show potential of being greatly parallelized. In this work, we explore hardware accelerated simulation-based inference over probabilistic models, by combining massively parallelized ABC inference algorithm with the cutting-edge AI chip solutions that are uniquely suited for this purpose. As a proof-of-concept, we demonstrate inference over a probabilistic epidemiology model used to predict the spread of COVID-19. Two hardware acceleration platforms are compared - the Tesla V100 GPU and the Graphcore Mark1 IPU. Our results show that while both of these platforms outperform multi-core CPUs, the Mk1 IPUs are 7.5x faster than the Tesla V100 GPUs for this workload.","PeriodicalId":320580,"journal":{"name":"2020 International Conference on Rebooting Computing (ICRC)","volume":"143 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133639422","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Virtualizing Analog Mesh Computers: The Case of a Photonic PDE Solving Accelerator","authors":"Jeff Anderson, Engin Kayraklioglu, H. Imani, M. Miscuglio, V. Sorger, T. El-Ghazawi","doi":"10.1109/ICRC2020.2020.00008","DOIUrl":"https://doi.org/10.1109/ICRC2020.2020.00008","url":null,"abstract":"Innovative processor architectures play a critical role in sustaining performance improvements under severe limitations imposed by feature size and energy consumption. The Reconfigurable Optical Computer (ROC) is one such innovative, Post-Moore’s Law processor. ROC is designed to solve partial differential equations in one shot as opposed to existing solutions, which are based on costly iterative computations. This is achieved by leveraging physical properties of a mesh of optical components that behave similarly to electrical resistances. However, building large photonic arrays to accommodate arbitrarily large problems is not yet feasible. It is also possible to have problems that are smaller than the size of the accelerator array. In both cases, virtualization is necessary. In this work, we introduce an architecture and methodology for light-weight virtualization of ROC. We show that overhead from virtualization is minimal, and our experimental results show two orders of magnitude increased speed as compared to microprocessor execution while keeping errors due to virtualization under 10%.","PeriodicalId":320580,"journal":{"name":"2020 International Conference on Rebooting Computing (ICRC)","volume":"375 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131895363","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rebooting Neuromorphic Design - A Complexity Engineering Approach","authors":"N. Ganesh","doi":"10.1109/ICRC2020.2020.00012","DOIUrl":"https://doi.org/10.1109/ICRC2020.2020.00012","url":null,"abstract":"As the compute demands for machine learning and artificial intelligence applications continue to grow, neuromorphic hardware has been touted as a potential solution. New emerging devices like memristors, spintronics, atomic switches, etc have shown tremendous potential to replace CMOS-based circuits but have been hindered by multiple challenges with respect to device variability, stochastic behavior and scalability. In this paper we will introduce a Description ↔ Design framework to analyze past successes in computing, understand current problems and identify a path moving forward. Engineering systems with these emerging devices might require the modification of both the type of descriptions of learning that we will design for, and the design methodologies we employ in order to realize these new descriptions. We will explore ideas from complexity engineering and analyze the advantages and challenges they offer over traditional approaches to neuromorphic design with novel computing fabrics. A reservoir computing example is used to understand the specific changes that would accompany in moving towards a complexity engineering approach. The time is ideal for a fundamental rethink of our design methodologies and success will represent a significant shift in how neuromorphic hardware is designed and pave the way for a new paradigm.","PeriodicalId":320580,"journal":{"name":"2020 International Conference on Rebooting Computing (ICRC)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129615005","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design Principles of Large-Scale Neuromorphic Systems Centered on High Bandwidth Memory","authors":"B. Pedroni, S. Deiss, Nishant Mysore, G. Cauwenberghs","doi":"10.1109/ICRC2020.2020.00013","DOIUrl":"https://doi.org/10.1109/ICRC2020.2020.00013","url":null,"abstract":"In order for neuromorphic computing to attain full throughput capacity, its hardware design must mitigate any inefficiencies that result from limited bandwidth to neural and synaptic information. In large-scale neuromorphic systems, synaptic memory access is typically the defining bottleneck, demanding that system design closely analyze the interdependence between the functional blocks to keep the memory as active as possible. In this paper, we formulate principles in memory organization of digital spiking neural networks, with a focus on systems with High Bandwidth Memory (HBM) as their bulk memory element. We present some of the fundamental steps and considerations required when designing a highly efficient HBM-centric system, and describe parallelization and pipelining solutions which serve as a foundational architecture for streamlined operation in any multi-port memory system. In our experiments using the Xilinx VU37P FPGA, we demonstrate random, short burst-length memory read bandwidths in excess of 400 GBps (95% relative to sequential-access peak bandwidth), supporting dynamically reconfigurable sparse synaptic connectivity. Therefore, the combination of our proposed network model with practical results suggest a promising path towards implementing highly parallel large-scale neuromorphic systems centered on HBM.","PeriodicalId":320580,"journal":{"name":"2020 International Conference on Rebooting Computing (ICRC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132433630","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}