S. Russek, C. Donnelly, M. Schneider, B. Baek, M. Pufall, W. Rippard, P. Hopkins, P. Dresselhaus, S. Benz
{"title":"Stochastic single flux quantum neuromorphic computing using magnetically tunable Josephson junctions","authors":"S. Russek, C. Donnelly, M. Schneider, B. Baek, M. Pufall, W. Rippard, P. Hopkins, P. Dresselhaus, S. Benz","doi":"10.1109/ICRC.2016.7738712","DOIUrl":"https://doi.org/10.1109/ICRC.2016.7738712","url":null,"abstract":"Single flux quantum (SFQ) circuits form a natural neuromorphic technology with SFQ pulses and superconducting transmission lines simulating action potentials and axons, respectively. Here we present a new component, magnetic Josephson junctions, that have a tunability and re-configurability that was lacking from previous SFQ neuromorphic circuits. The nanoscale magnetic structure acts as a tunable synaptic constituent that modifies the junction critical current. These circuits can operate near the thermal limit where stochastic firing of the neurons is an essential component of the technology. This technology has the ability to create complex neural systems with greater than 10^21 neural firings per second with approximately 1 W dissipation.","PeriodicalId":387008,"journal":{"name":"2016 IEEE International Conference on Rebooting Computing (ICRC)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123217287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Challenges for optical interconnect for beyond Moore's law computing","authors":"A. Lentine, C. DeRose","doi":"10.1109/ICRC.2016.7738696","DOIUrl":"https://doi.org/10.1109/ICRC.2016.7738696","url":null,"abstract":"We describe the challenge of implementing optical interconnect for beyond Moore's electronic devices. In particular, we developed a simple link model and calculated the optical communications energy for logic voltages down to 10 mV. The results of this link model show a limit to the minimum communications energy that depends on the achievable extinction ratio of the devices. This work gives some insight into the tack that should be taken for improved optical devices to have an impact in future computing systems using ultra-low voltage transistor devices.","PeriodicalId":387008,"journal":{"name":"2016 IEEE International Conference on Rebooting Computing (ICRC)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125238727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"All-optical neuromorphic computing in optical networks of semiconductor lasers","authors":"D. Brunner, S. Reitzenstein, Ingo Fischer","doi":"10.1109/ICRC.2016.7738705","DOIUrl":"https://doi.org/10.1109/ICRC.2016.7738705","url":null,"abstract":"Networks of interconnected nodes are at the heart of every neural network concept. While neural networks have been implemented in various hardware systems, the efficient realization of such networks still represents a major challenge. We demonstrate the implementation of an all-optical network scheme based on holographic coupling and induce complex spatio-temporal transients with Gigahertz bandwidth. Our scheme illustrates the potential of such all-optical systems for future neural network implementations.","PeriodicalId":387008,"journal":{"name":"2016 IEEE International Conference on Rebooting Computing (ICRC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130384907","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rethinking operating systems for rebooted computing","authors":"Phil Laplante, D. Milojicic","doi":"10.1109/ICRC.2016.7738695","DOIUrl":"https://doi.org/10.1109/ICRC.2016.7738695","url":null,"abstract":"As the deceleration of processor scaling due to Moore's law accelerates research in new types of computing structures, the need arises for rethinking operating system paradigms. Traditionally, an operating system is a layer between hardware and applications and its primary function is in managing hardware resources and providing a common abstraction to applications. How does this function apply, however, to new types of computing paradigms? Are operating systems even needed for these new structures? This paper revisits operating system functionality for new computing paradigms. The structure of these new computers is uncertain as there are many possibilities such as neuromorphic, bio-inspired, adiabatic, reversible, approximate, quantum, combinations of these and others unforeseen [1]. We do know, however, that whatever these new computers will be, there will be some need to manage their resources, to provide programming support, to partition, scale, and connect them and to deal with (partial) failure, along with other traditional operating system functionality. There might also be some new functionality, such as creating abstract control loops, reasoning about precision, new ways of reconfiguring, and more. 
We strongly believe that, even if traditional operating system functionality evolves, the need for operating systems will remain in the new era of computing.","PeriodicalId":387008,"journal":{"name":"2016 IEEE International Conference on Rebooting Computing (ICRC)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122113598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A path toward ultra-low-energy computing","authors":"E. Debenedictis, M. Frank, N. Ganesh, N. Anderson","doi":"10.1109/ICRC.2016.7738677","DOIUrl":"https://doi.org/10.1109/ICRC.2016.7738677","url":null,"abstract":"At roughly kT energy dissipation per operation, the thermodynamic energy efficiency “limits” of Moore's Law were unimaginably far off in the 1960s. However, current computers operate at only 100-10,000 times this limit, forming an argument that historical rates of efficiency scaling must soon slow. This paper reviews the justification for the ~kT per operation limit in the context of processors for von Neumann-class computer architectures of the 1960s. We then reapply the fundamental arguments to contemporary applications and identify a new direction for future computing in which the ultimate efficiency limits would be much further out. New nanodevices with high-level functions that aggregate the functionality of several logic gates and some local memory may be the right building blocks for much more energy-efficient execution of emerging applications, such as neural networks.","PeriodicalId":387008,"journal":{"name":"2016 IEEE International Conference on Rebooting Computing (ICRC)","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123320478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Information-theoretic limits of algorithmic noise tolerance","authors":"Daewon Seo, L. Varshney","doi":"10.1109/ICRC.2016.7738715","DOIUrl":"https://doi.org/10.1109/ICRC.2016.7738715","url":null,"abstract":"Statistical error compensation techniques in computing circuits are becoming prevalent, especially as implemented on nanoscale physical substrates. One such technique that has been developed and deployed is algorithmic noise tolerance (ANT), which aggregates information from several computational branches operating at different points along energy-reliability circuit tradeoffs. To understand this practical approach better, it is of interest to develop limit theorems on optimal designs, no matter how much design effort is put in. The purpose of this paper is to develop a fundamental limit for ANT by making an analogy to the CEO problem in multiterminal source coding, extended to the setting with a mixed set of discrete and continuous random variables. Since statistical signal processing and machine learning are key workloads for modern computing, we specifically discuss performance measured according to logarithmic distortion, in addition to mean-squared error. We find the Gaussian CEO problem provides performance bounds for ANT under both kinds of distortion.","PeriodicalId":387008,"journal":{"name":"2016 IEEE International Conference on Rebooting Computing (ICRC)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134044209","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Takuya Okuyama, C. Yoshimura, Masato Hayashi, M. Yamaoka
{"title":"Computing architecture to perform approximated simulated annealing for Ising models","authors":"Takuya Okuyama, C. Yoshimura, Masato Hayashi, M. Yamaoka","doi":"10.1109/ICRC.2016.7738673","DOIUrl":"https://doi.org/10.1109/ICRC.2016.7738673","url":null,"abstract":"In the near future, the techniques to solve combinatorial optimization problems will become important in various fields and require large computing power. However, the performance growth of von Neumann architecture will slow down due to the end of semiconductor scaling. To resolve this problem, a computing architecture is proposed that maps the optimization problems to the ground state search of Ising models. The authors implemented the architecture, which finds the ground state by circuit operations inspired by SA, in CMOS circuits. The architecture adopts a modified algorithm using a majority function to simplify circuits. Though the power efficiency can be estimated to be 1800 times higher than that of a CPU, the modification deteriorates solution quality because it breaks the detailed balance condition. This paper presents a computing architecture that performs SA for Ising models approximately. The architecture satisfies the condition by utilizing the fact that the output of the majority voter circuit with stochastically processed inputs approximately behaves in accordance with the Glauber dynamics. Simulations demonstrate that solution quality of the proposed architecture is as good as that of SA. 
Our architecture can be power-efficient because the rate of increase in the number of transistors is less than 42%.","PeriodicalId":387008,"journal":{"name":"2016 IEEE International Conference on Rebooting Computing (ICRC)","volume":"2010 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127342653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Erasing logic-memory boundaries in superconductor electronics","authors":"V. Semenov","doi":"10.1109/ICRC.2016.7738711","DOIUrl":"https://doi.org/10.1109/ICRC.2016.7738711","url":null,"abstract":"Superconductor electronics holds records for clock frequency and energy efficiency at the chip level, and is also projected to large systems that can absorb the cooling overhead. These advantages have nevertheless only managed to give this technology a back seat to CMOS digital circuits, which offer orders of magnitude more complexity. Furthermore, pursuing the CMOS paradigm with superconductor circuits would bring their clock frequency down to that of CMOS circuits. This makes superconductor technology much more open to risky innovations and even for paradigm changes. In the paper we point out that both (speed and energy efficiency) advantages of superconductor electronics could be preserved due to a unique composition of memory and logic functions of RSFQ cells. We propose to reorganize the original RSFQ cells into a new family of Memory And loGIC (MAGIC) gate/register objects that run arithmetic calculations as well as store results. The new MAGIC objects eliminate the time and energy overheads associated with the conventional transfer of computed data to memory by essentially reducing the transfer distance to zero. The new objects could serve as building blocks for distributed MAGIC-compatible architectures, differing from CMOS-like register files by processing as well as storing data. A simple Logic Unit (LU) would be sufficient to control the MAGIC registers, because the registers would provide most of the arithmetic functions and separate ALUs would not be needed. The reduction of the data exchange between logic and memory units leads to additional energy saving. Factorization of large integers is presented as an example illustrating the speed and density advantages of the new approach. 
The end result will be a superior technology which offers a combination of performance and energy efficiency unattainable by existing technologies or their possible extensions.","PeriodicalId":387008,"journal":{"name":"2016 IEEE International Conference on Rebooting Computing (ICRC)","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128691612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accelerating Discrete Fourier Transforms with dot-product engine","authors":"Miao Hu, J. Strachan","doi":"10.1109/ICRC.2016.7738682","DOIUrl":"https://doi.org/10.1109/ICRC.2016.7738682","url":null,"abstract":"Discrete Fourier Transforms (DFT) are extremely useful in signal processing. Usually they are computed with the Fast Fourier Transform (FFT) method as it reduces the computing complexity from O(N^2) to O(N log N). However, FFT is still not powerful enough for many real-time tasks which have stringent requirements on throughput, energy efficiency and cost, such as Internet of Things (IoT). In this paper, we present a solution of computing DFT using the dot-product engine (DPE) - a one transistor one memristor (1T1M) crossbar array with hybrid peripheral circuit support. With this solution, the computing complexity is further reduced to a constant O(λ) independent of the input data size, where λ is the timing ratio of one DPE operation compared to one real multiplication operation in digital systems.","PeriodicalId":387008,"journal":{"name":"2016 IEEE International Conference on Rebooting Computing (ICRC)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123429468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
César O. Campos-Aguillón, Rene Celis-Cordova, Ismo Hänninen, C. Lent, A. Orlov, G. Snider
{"title":"A Mini-MIPS microprocessor for adiabatic computing","authors":"César O. Campos-Aguillón, Rene Celis-Cordova, Ismo Hänninen, C. Lent, A. Orlov, G. Snider","doi":"10.1109/ICRC.2016.7738678","DOIUrl":"https://doi.org/10.1109/ICRC.2016.7738678","url":null,"abstract":"This paper examines adiabatic logic for computation, and presents a design for a MIPS processor implemented in CMOS. Adiabatic reversible logic was examined in the 1980s and 1990s but in that era power dissipation was a secondary concern, and the trade-off of reduced computational speed for reduced power was deemed unacceptable. Now, power dissipation and the associated heat are the major obstacles limiting progress in integrated circuits, particularly processors. In modern processors trading performance for reduced power dissipation is already done using techniques such as multi-core and dark silicon, so adiabatic logic may now be an attractive approach. To evaluate the adiabatic approach, this paper uses the figure of merit of the product of switching energy, delay time, and area (EDA). Using this figure of merit, adiabatic logic is shown to be advantageous when additional constraints are considered, such as maximum allowed power density. As a proof of concept circuit, a simplified MIPS microprocessor was designed using adiabatic logic based on split-rail charge recovery logic and Bennett clocking. New design and verification tools were developed using structural Verilog and extensions of ModelSim to provide needed capabilities that are not available in commercial packages. 
The design is implemented using a standard cell design.","PeriodicalId":387008,"journal":{"name":"2016 IEEE International Conference on Rebooting Computing (ICRC)","volume":"119 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133396825","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}