{"title":"2024 Index IEEE Journal on Exploratory Solid-State Computational Devices and Circuits Vol. 10","authors":"","doi":"10.1109/JXCDC.2025.3531616","DOIUrl":"https://doi.org/10.1109/JXCDC.2025.3531616","url":null,"abstract":"","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"10 ","pages":"187-194"},"PeriodicalIF":2.0,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10845029","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142992841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"INFORMATION FOR AUTHORS","authors":"","doi":"10.1109/JXCDC.2024.3499819","DOIUrl":"https://doi.org/10.1109/JXCDC.2024.3499819","url":null,"abstract":"","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"10 ","pages":"C3-C3"},"PeriodicalIF":2.0,"publicationDate":"2025-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10844024","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142992840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Special Topic on 3-D Logic and Memory for Energy Efficient Computing","authors":"editorial","doi":"10.1109/JXCDC.2024.3518312","DOIUrl":"https://doi.org/10.1109/JXCDC.2024.3518312","url":null,"abstract":"","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"10 ","pages":"iii-iv"},"PeriodicalIF":2.0,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10832462","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142938157","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"E-MAC: Enhanced In-SRAM MAC Accuracy via Digital-to-Time Modulation","authors":"Saeed Seyedfaraji;Salar Shakibhamedan;Amire Seyedfaraji;Baset Mesgari;Nima Taherinejad;Axel Jantsch;Semeen Rehman","doi":"10.1109/JXCDC.2024.3518633","DOIUrl":"https://doi.org/10.1109/JXCDC.2024.3518633","url":null,"abstract":"In this article, we introduce a novel technique called E-multiplication and accumulation (MAC) (EMAC), aimed at enhancing energy efficiency, reducing latency, and improving the accuracy of analog-based in-static random access memory (SRAM) MAC accelerators. Our approach involves a digital-to-time word-line (WL) modulation technique that encodes the WL voltage while preserving the necessary linear voltage drop for precise computations. This eliminates the need for an additional digital-to-analog converter (DAC) in the design. Furthermore, the SRAM-based logical weight encoding scheme we present reduces the reliance on capacitance-based techniques, which typically introduce area overhead in the circuit. This approach ensures consistent voltage drops for all equivalent cases [i.e., \u0000<inline-formula> <tex-math>$(a { times} b) = (b times a)$ </tex-math></inline-formula>\u0000], addressing a persistent issue in existing state-of-the-art methods. Compared with state-of-the-art analog-based in-SRAM techniques, our E-MAC approach demonstrates significant energy savings (\u0000<inline-formula> <tex-math>$1.89times $ </tex-math></inline-formula>\u0000) and improved accuracy (73.25%) per MAC computation from a 1-V power supply, while achieving a \u0000<inline-formula> <tex-math>$11.84times $ </tex-math></inline-formula>\u0000 energy efficiency improvement over baseline digital approaches. Our application analysis shows a marginal overall reduction in accuracy, i.e., a 0.1% and 0.17% reduction for LeNet5-based CNN and VGG16, respectively, when trained on the MNIST and ImageNet datasets.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"10 ","pages":"178-186"},"PeriodicalIF":2.0,"publicationDate":"2024-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10804123","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142918524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of a Plasmon-Based Optical Integrated Circuit for Error-Tolerant Streaming Applications","authors":"Samantha Lubaba Noor;Xuan Wu;Dennis Lin;Pol van Dorpe;Francky Catthoor;Patrick Reynaert;Azad Naeemi","doi":"10.1109/JXCDC.2024.3510684","DOIUrl":"https://doi.org/10.1109/JXCDC.2024.3510684","url":null,"abstract":"In this work, we have designed and modeled an integrated plasmonic computing module, which operates at 200 GHz clock frequency for high-end streaming algorithm applications. Our work includes designing the individual optical components (modulator, logic gate, and photodetector) and high-speed electronic driver circuits and integrating the components considering their interactions. We have also holistically evaluated the system-level performance of the computing module, taking into account various factors such as power consumption, operational speed, physical footprint, and average temperature. Through rigorous numerical analyses, we have found that with the existing technology and available materials, the plasmonic computing module can best achieve a bit-error-ratio (BER) of \u0000<inline-formula> <tex-math>$10^{-1}$ </tex-math></inline-formula>\u0000. The performance can be improved by using a high electrooptic coefficient material in the phase shifter and increasing the driver circuit’s swing to greater than 1 V.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"10 ","pages":"170-177"},"PeriodicalIF":2.0,"publicationDate":"2024-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10777494","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142918295","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ferroelectric Transistor-Based Synaptic Crossbar Arrays: The Impact of Ferroelectric Thickness and Device-Circuit Interactions","authors":"Chunguang Wang;Sumeet Kumar Gupta","doi":"10.1109/JXCDC.2024.3502053","DOIUrl":"https://doi.org/10.1109/JXCDC.2024.3502053","url":null,"abstract":"Ferroelectric transistors (FeFETs)-based crossbar arrays have shown immense promise for computing-in-memory (CiM) architectures targeted for neural accelerator designs. Offering CMOS compatibility, nonvolatility, compact bit cell, and CiM-amenable features, such as multilevel storage and voltage-driven conductance tuning, FeFETs are among the foremost candidates for synaptic devices. However, device and circuit nonideal attributes in FeFETs-based crossbar arrays cause the output currents to deviate from the expected value, which can induce error in CiM of matrix-vector multiplications (MVMs). In this article, we analyze the impact of ferroelectric thickness (\u0000<inline-formula> <tex-math>$T_{text {FE}}$ </tex-math></inline-formula>\u0000) and cross-layer interactions in FeFETs-based synaptic crossbar arrays accounting for device-circuit nonidealities. First, based on a physics-based model of multidomain FeFETs calibrated to experiments, we analyze the impact of \u0000<inline-formula> <tex-math>$T_{text {FE}}$ </tex-math></inline-formula>\u0000 on the characteristics of FeFETs as synaptic devices, highlighting the connections between the multidomain physics and the synaptic attributes. Based on this analysis, we investigate the impact of \u0000<inline-formula> <tex-math>$T_{text {FE}}$ </tex-math></inline-formula>\u0000 in conjunction with other design parameters, such as number of bits stored per device (bit slice), wordline (WL) activation schemes, and FeFETs width on the error probability, area, energy, and latency of CiM at the array level. Our results show that FeFETs with \u0000<inline-formula> <tex-math>$T_{text {FE}}$ </tex-math></inline-formula>\u0000 around 7 nm achieve the highest CiM robustness, while FeFETs with \u0000<inline-formula> <tex-math>$T_{text {FE}}$ </tex-math></inline-formula>\u0000 around 10 nm offer the lowest CiM energy and latency. While the CiM robustness for bit slice 2 is less than bit slice 1, its robustness can be brought to a target level via additional design techniques, such as partial wordline activation and optimization of FeFETs width.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"10 ","pages":"144-152"},"PeriodicalIF":2.0,"publicationDate":"2024-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10756727","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142798031","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SpecPCM: A Low-Power PCM-Based In-Memory Computing Accelerator for Full-Stack Mass Spectrometry Analysis","authors":"Keming Fan;Ashkan Moradifirouzabadi;Xiangjin Wu;Zheyu Li;Flavio Ponzina;Anton Persson;Eric Pop;Tajana Rosing;Mingu Kang","doi":"10.1109/JXCDC.2024.3498837","DOIUrl":"https://doi.org/10.1109/JXCDC.2024.3498837","url":null,"abstract":"Mass spectrometry (MS) is essential for proteomics and metabolomics but faces impending challenges in efficiently processing the vast volumes of data. This article introduces SpecPCM, an in-memory computing (IMC) accelerator designed to achieve substantial improvements in energy and delay efficiency for both MS spectral clustering and database (DB) search. SpecPCM employs analog processing with low-voltage swing and utilizes recently introduced phase change memory (PCM) devices based on superlattice materials, optimized for low-voltage and low-power programming. Our approach integrates contributions across multiple levels: application, algorithm, circuit, device, and instruction sets. We leverage a robust hyperdimensional computing (HD) algorithm with a novel dimension-packing method and develop specialized hardware for the end-to-end MS pipeline to overcome the nonideal behavior of PCM devices. We further optimize multilevel PCM devices for different tasks by using different materials. We also perform a comprehensive design exploration to improve energy and delay efficiency while maintaining accuracy, exploring various combinations of hardware and software parameters controlled by the instruction set architecture (ISA). SpecPCM, with up to three bits per cell, achieves speedups of up to \u0000<inline-formula> <tex-math>$82times $ </tex-math></inline-formula>\u0000 and \u0000<inline-formula> <tex-math>$143times $ </tex-math></inline-formula>\u0000 for MS clustering and DB search tasks, respectively, along with a four-orders-of-magnitude improvement in energy efficiency compared with state-of-the-art (SoA) CPU/GPU tools.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"10 ","pages":"161-169"},"PeriodicalIF":2.0,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10753646","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142859023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"X-TIME: Accelerating Large Tree Ensembles Inference for Tabular Data With Analog CAMs","authors":"Giacomo Pedretti;John Moon;Pedro Bruel;Sergey Serebryakov;Ron M. Roth;Luca Buonanno;Archit Gajjar;Lei Zhao;Tobias Ziegler;Cong Xu;Martin Foltin;Paolo Faraboschi;Jim Ignowski;Catherine E. Graves","doi":"10.1109/JXCDC.2024.3495634","DOIUrl":"https://doi.org/10.1109/JXCDC.2024.3495634","url":null,"abstract":"Structured, or tabular, data are the most common format in data science. While deep learning models have proven formidable in learning from unstructured data such as images or speech, they are less accurate than simpler approaches when learning from tabular data. In contrast, modern tree-based machine learning (ML) models shine in extracting relevant information from structured data. An essential requirement in data science is to reduce model inference latency in cases where, for example, models are used in a closed loop with simulation to accelerate scientific discovery. However, the hardware acceleration community has mostly focused on deep neural networks and largely ignored other forms of ML. Previous work has described the use of an analog content addressable memory (CAM) component for efficiently mapping random forests (RFs). In this work, we develop an analog-digital architecture that implements a novel increased precision analog CAM and a programmable chip for inference of state-of-the-art tree-based ML models, such as eXtreme Gradient Boosting (XGBoost), Categorical Boosting (CatBoost), and others. Thanks to hardware-aware training, X-TIME reaches state-of-the-art accuracy and \u0000<inline-formula> <tex-math>$119times $ </tex-math></inline-formula>\u0000 higher throughput at \u0000<inline-formula> <tex-math>$9740times $ </tex-math></inline-formula>\u0000 lower latency with \u0000<inline-formula> <tex-math>${gt }150times $ </tex-math></inline-formula>\u0000 improved energy efficiency compared with a state-of-the-art GPU for models with up to 4096 trees and depth of 8, with a 19-W peak power consumption.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"10 ","pages":"116-124"},"PeriodicalIF":2.0,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10753423","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142777850","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Approximated 2-Bit Adders for Parallel In-Memristor Computing With a Novel Sum-of-Product Architecture","authors":"Christian Simonides;Dominik Gausepohl;Peter M. Hinkel;Fabian Seiler;Nima Taherinejad","doi":"10.1109/JXCDC.2024.3497720","DOIUrl":"https://doi.org/10.1109/JXCDC.2024.3497720","url":null,"abstract":"Conventional computing methods struggle with the exponentially increasing demand for computational power, caused by applications including image processing and machine learning (ML). Novel computing paradigms such as in-memory computing (IMC) and approximate computing (AxC) provide promising solutions to this problem. Due to their low energy consumption and inherent ability to store data in a nonvolatile fashion, memristors are an increasingly popular choice in these fields. There is a wide range of logic forms compatible with memristive IMC, each offering different advantages. We present a novel mixed-logic solution that utilizes properties of the sum-of-product (SOP) representation and propose a full-adder circuit that works efficiently in 2-bit units. To further improve the speed, area usage, and energy consumption, we propose two additional approximate (Ax) 2-bit adders that exhibit inherent parallelization capabilities. We apply the proposed adders in selected image processing applications, where our Ax approach reduces the energy consumption by \u0000<inline-formula> <tex-math>$mathrm {31~!%}$ </tex-math></inline-formula>\u0000–\u0000<inline-formula> <tex-math>$mathrm {40~!%}$ </tex-math></inline-formula>\u0000 and improves the speed by \u0000<inline-formula> <tex-math>$mathrm {50~!%}$ </tex-math></inline-formula>\u0000. To demonstrate the potential gains of our approximations in more complex applications, we applied them in ML. Our experiments indicate that with up to \u0000<inline-formula> <tex-math>$6/16$ </tex-math></inline-formula>\u0000 Ax adders, there is no accuracy degradation when applied in a convolutional neural network (CNN) that is evaluated on MNIST. Our approach can save up to 125.6 mJ of energy and 505 million steps compared to our exact approach.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"10 ","pages":"135-143"},"PeriodicalIF":2.0,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10752571","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142798030","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}