{"title":"High-Endurance Bipolar ReRAM-Based Non-Volatile Flip-Flops with Run-Time Tunable Resistive States","authors":"Mehrdad Biglari, T. Lieske, D. Fey","doi":"10.1145/3232195.3232217","DOIUrl":"https://doi.org/10.1145/3232195.3232217","url":null,"abstract":"ReRAM technologies feature desired properties, e.g. fast switching and high read margin, that make them attractive candidates to be used in non-volatile flip-flops (NVFFs). However, they suffer from limited endurance. Therefore, cell degradation considerations are a necessity for practical deployment in non-volatile processors (NVPs). In this paper, we present two bipolar ReRAM-based NVFFs, Hypnos and Morpheus, with enhanced endurance and energy efficiency. Hypnos reduces the ReRAM electrical stress during set operation while keeping the imposed NVFF area overhead at a minimum. In Morpheus, a write-termination circuit is used to further enhance the ReRAM endurance and energy efficiency at the cost of an affordable area overhead. Moreover, both NVFFs feature run-time tunable resistive states to enable on-line adjustment of the trade-off among endurance, retention, energy consumption, and restore success rate (in case of approximate computing). Experimental results demonstrate that Hypnos reduces the ReRAM set degradation by 91%, on average. Moreover, the write-termination mechanism in Morpheus further reduces the remaining degradation by 93%/97% in set/reset operation, on average. The results also demonstrate enhanced energy efficiency in both NVFFs.","PeriodicalId":401010,"journal":{"name":"2018 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122683441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Minimal Disturbed Bits in Writing Resistive Crossbar Memories","authors":"M. Fouda, A. Eltawil, F. Kurdahi","doi":"10.1145/3232195.3232207","DOIUrl":"https://doi.org/10.1145/3232195.3232207","url":null,"abstract":"Resistive memories are promising candidates for non-volatile memories. Write disturb is one of problems that facing this kind of memories. In this paper, the write disturb problem is mathematically formulated in terms of the bias parameters and optimized analytically. A closed form solution for the optimal bias parameters is calculated. Results are compared with the 1/2 and 1/3 bias schemes showing a significant improvement.","PeriodicalId":401010,"journal":{"name":"2018 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127372348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Michail Noltsis, Panayiotis Englezakis, E. Maragkoudaki, C. Nicopoulos, D. Rodopoulos, F. Catthoor, Yiannakis Sazeides, Davide Zoni, D. Soudris
{"title":"Fast Estimations of Failure Probability Over Long Time Spans","authors":"Michail Noltsis, Panayiotis Englezakis, E. Maragkoudaki, C. Nicopoulos, D. Rodopoulos, F. Catthoor, Yiannakis Sazeides, Davide Zoni, D. Soudris","doi":"10.1145/3232195.3232198","DOIUrl":"https://doi.org/10.1145/3232195.3232198","url":null,"abstract":"Shrinking of device dimensions has undoubtedly enabled the very large scale integration of transistors on electronic chips. However, it has also brought to surface time-zero and time-dependent variation phenomena that degrade system’s performance and threaten functional operation. Hence, the need to capture and describe these mechanisms, as well as effectively model their impact is crucial. To this extent, we follow existing models and propose a complete framework that evaluates failure probability of electronic components. To assess our framework, a case-study of packet-switched Network on Chip (NoC) routers is presented, studying the failure probability of its SRAM buffers.","PeriodicalId":401010,"journal":{"name":"2018 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114543880","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chao Ding, W. Kang, He Zhang, Youguang Zhang, Weisheng Zhao
{"title":"A Novel Cross-point MRAM with Diode Selector Capable of High-Density, High-Speed, and Low-Power In-Memory Computation","authors":"Chao Ding, W. Kang, He Zhang, Youguang Zhang, Weisheng Zhao","doi":"10.1145/3232195.3232218","DOIUrl":"https://doi.org/10.1145/3232195.3232218","url":null,"abstract":"In-Memory Computation (IMC), which is capable of reducing the power consumption and bandwidth requirement resulting from the data transfer between the processing and memory units, has been considered as a promising technology to break the von-Neumann bottleneck. In order to develop an effective and efficient IMC platform, the performance, such as density, operation speed and power consumption, of the memory itself is one of the most important keys. In this work, we report a cross-point magnetic random access memory (MRAM) with diode selector for IMC implementation. The memory cell consists of a magnetic tunnel junction (MTJ) device and a diode connected in series. The memory cells are arranged in a cross-point array structure, providing high storage density. The MTJ can be switched through the unipolar precessional voltage-controlled magnetic anisotropy (VCMA) effect, thus enabling high speed and low power. Further, Boolean logic functions can be realized via regular memory-like write & read operations. The feasibility and performance of the proposed IMC in the cross-point MRAM are successfully demonstrated with hybrid VCMA-MTJ/CMOS circuit simulations under the 40 nm technology node.*","PeriodicalId":401010,"journal":{"name":"2018 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"359 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133454778","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Overcoming Crossbar Nonidealities in Binary Neural Networks Through Learning","authors":"M. Fouda, Jongeun Lee, A. Eltawil, F. Kurdahi","doi":"10.1145/3232195.3232226","DOIUrl":"https://doi.org/10.1145/3232195.3232226","url":null,"abstract":"The crossbar nonidealaties may considerably degrade the accuracy of matrix multiplication operation, which is the cornerstone of hardware accelerated neural networks. In this paper, we show that the crossbar nonidealities especially the wire resistance should be taken into consideration for accurate evaluation. We also present a simple yet highly effective way to capture the wire resistance effect for the inference and training of deep neural networks without extensive SPICE simulations. Different scenarios have been studied and used to show the efficacy of our proposed method.","PeriodicalId":401010,"journal":{"name":"2018 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116739789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Catherine E. Graves, W. Ma, X. Sheng, B. Buchanan, Le Zheng, Sity Lam, Xuema Li, S. R. Chalamalasetti, Lennie Kiyama, M. Foltin, J. Strachan, Matthew P. Hardy
{"title":"Regular Expression Matching with Memristor TCAMs for Network Security","authors":"Catherine E. Graves, W. Ma, X. Sheng, B. Buchanan, Le Zheng, Sity Lam, Xuema Li, S. R. Chalamalasetti, Lennie Kiyama, M. Foltin, J. Strachan, Matthew P. Hardy","doi":"10.1145/3232195.3232201","DOIUrl":"https://doi.org/10.1145/3232195.3232201","url":null,"abstract":"We propose using memristor-based TCAMs (Ternary Content Addressable Memory) to accelerate Regular Expression (RegEx) matching. RegEx matching is a key function in network security, where deep packet inspection finds and filters out malicious actors. However, RegEx matching latency and power can be incredibly high and current proposals are challenged to perform wire-speed matching for large scale rulesets. Our approach dramatically decreases RegEx matching operating power, provides high throughput, and the use of mTCAMs enables novel compression techniques to expand ruleset sizes and allows future exploitation of the multi-state (analog) capabilities of memristors. We fabricated and demonstrated nanoscale memristor TCAM cells. SPICE simulations investigate mTCAM performance at scale and a mTCAM power model at 22nm demonstrates 0.2 fJ/bit/search energy for a 36×400 mTCAM. We further propose a tiled architecture which implements a Snort rule-set and assess the application performance. Compared to a state-of-the-art FPGA approach (2 Gbps, −1W), we show ×4 throughput (8 Gbps) at 60% the power (0.62W) before applying standard TCAM power-saving techniques. Our performance comparison improves further when striding (searching multiple characters) is considered, resulting in 47.2 Gbps at 1.3W for our approach compared to 3.9 Gbps at 630mW for the strided FPGA NFA, demonstrating a promising path to wire-speed RegEx matching on large scale rulesets.","PeriodicalId":401010,"journal":{"name":"2018 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132270432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Qin Li, Huifeng Zhu, F. Qiao, Qi Wei, Xinjun Liu, Huazhong Yang
{"title":"Energy-efficient MFCC extraction architecture in mixed-signal domain for automatic speech recognition","authors":"Qin Li, Huifeng Zhu, F. Qiao, Qi Wei, Xinjun Liu, Huazhong Yang","doi":"10.1145/3232195.3232219","DOIUrl":"https://doi.org/10.1145/3232195.3232219","url":null,"abstract":"This paper proposes a novel processing architecture to extract Mel-Frequency Cepstrum Coefficients (MFCC) for automatic speech recognition. Inspired by the human ear, the energy-efficient analog-domain information processing is adopted to replace the energy-intensive Fourier Transform in conventional digital-domain. Moreover, the proposed architecture extracts the acoustic features in the mixed-signal domain, which significantly reduces the cost of Analog-to-Digital Converter (ADC) and the computational complexity. We carry out the circuit-level simulation based on 180nm CMOS technology, which shows an energy consumption of 2.4 nJ/frame, and a processing speed of 45.79 μs/frame. The proposed architecture achieves 97.2% energy saving and about 6.4× speedup than state of the art. Speech recognition simulation reaches the classification accuracy of 99% using the proposed MFCC features.","PeriodicalId":401010,"journal":{"name":"2018 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125246186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Heinz Riener, Eleonora Testa, L. Amarù, Mathias Soeken, G. Micheli
{"title":"Size Optimization of MIGs with an Application to QCA and STMG Technologies","authors":"Heinz Riener, Eleonora Testa, L. Amarù, Mathias Soeken, G. Micheli","doi":"10.1145/3232195.3232202","DOIUrl":"https://doi.org/10.1145/3232195.3232202","url":null,"abstract":"Majority-inverter graphs (MIGs) are a logic representation with remarkable algebraic and Boolean properties that enable efficient logic optimizations beyond the capabilities of traditional logic representations. Further, since many nano-emerging technologies, such as quantum-dot cellular automata (QCA) or spin torque majority gates (STMG), are inherently majority-based, MIGs serve as a natural logic representation to map into these technologies. So far, MIG optimization methods predominantly target to reduce the depth of the logic networks, corresponding to low delay implementations in the respective technologies. In this paper, we introduce several methods to optimize the size of MIGs. They can be applied such that the depth of the logic network is preserved; therefore our methods have a direct effect on the physical area, without worsening the delay. Some methods are inspired by existing size optimization algorithms for non-majority-based logic networks, others make explicit use of the majority function and its properties. All methods are Boolean—in contrast to algebraic optimization methods—which has a positive effect on the quality but challenges their implementation. Our experiments show that using our methods the size of MIGs in the EPFL combinational benchmark suite can be reduced by up to 7.12%. When mapped to QCA and STMG technologies we reduce the average area-delay-energy product by 2.31% and 2.07%, respectively.","PeriodicalId":401010,"journal":{"name":"2018 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"488 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134197128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Orestis Liolis, Vassilios A. Mardiris, G. Sirakoulis, I. Karafyllidis
{"title":"Qantum-dot Cellular Automata RAM design using Crossbar Architecture","authors":"Orestis Liolis, Vassilios A. Mardiris, G. Sirakoulis, I. Karafyllidis","doi":"10.1145/3232195.3232216","DOIUrl":"https://doi.org/10.1145/3232195.3232216","url":null,"abstract":"In this paper, a new approach of RAM circuits, using Quantum-dot Cellular Automata (QCA), based on programmable crossbar architecture, is presented. In addition, a methodology for 2n bits RAMs is presented. Using the aforementioned methodology any designer can design a RAM regardless of its size. The proposed designs utilize the benefits of QCA programmable crossbar architecture. Namely, the RAM circuit is characterized by regularity and the ability of customization. The features that the proposed RAM design methodology has, allow the designers to use the available circuit area efficiently.","PeriodicalId":401010,"journal":{"name":"2018 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"196 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120863752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Representation of Qubit States using 3D Memristance Spaces: A first step towards a Memristive Quantum Simulator","authors":"I. Karafyllidis, G. Sirakoulis, P. Dimitrakis","doi":"10.1145/3232195.3232197","DOIUrl":"https://doi.org/10.1145/3232195.3232197","url":null,"abstract":"Development of quantum simulators is a major step towards the universal quantum computer. Quantum simulators are quantum systems that can perform specific quantum computations, or software packages that can reproduce most of the aspects of a general universal quantum computer on a general purpose classical computer. Development of quantum simulators using digital circuits, such as FPGAs is very difficult, mainly because the unit of quantum information, the qubit, has an infinite number of states, whereas the classical bit has only two. On the other hand, analog circuits comprising R, L and C elements have no internal state variables that can be used to reproduce and store qubit states. Here we take the first step towards the development of a new quantum simulator using memristors. The qubit state is mapped to a 3D space spanned by the memristances of three identical memristors. The qubit state evolution is reproduced by the input voltages applied to the memr-istors. We define the correspondence between the general qubit state rotation, i.e. the one-qubit quantum gates, and memristor input voltage variations and reproduce the rotations imposed by the action of quantum gates in the 3D memristance space. Our results show that, at least in principle, qubits and one-qubit quantum gates can be simulated by memristors.","PeriodicalId":401010,"journal":{"name":"2018 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121374459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}