Madison Manley;James Read;Ankit Kaul;Shimeng Yu;Muhannad Bakir
{"title":"Co-Optimization of Power Delivery Network Design for 3-D Heterogeneous Integration of RRAM-Based Compute In-Memory Accelerators","authors":"Madison Manley;James Read;Ankit Kaul;Shimeng Yu;Muhannad Bakir","doi":"10.1109/JXCDC.2025.3534560","DOIUrl":"https://doi.org/10.1109/JXCDC.2025.3534560","url":null,"abstract":"Three-dimensional heterogeneous integration (3D-HI) offers promising solutions for incorporating substantial embedded memory into cutting-edge analog compute-in-memory (CIM) AI accelerators, addressing the need for on-chip acceleration of large AI models. However, this approach faces challenges with power supply noise (PSN) margins due to $V_{\\text{DD}}$ scaling and increased power delivery network (PDN) impedance. This study demonstrates the necessity and benefits of 3D-HI for large-scale CIM accelerators, where 2-D implementations would exceed manufacturing reticle limits. Our 3-D designs achieve 39% higher energy efficiency, $8\\times$ higher operation density, and improved throughput through shorter vertical interconnects. We quantify steady-state IR-drop impacts in 3D-HI CIM architectures using a framework that combines PDN modeling, 3D-HI power, performance, area estimation, and behavioral modeling. We demonstrate that a drop in supply voltage to CIM arrays increases sensitivity to process, voltage, and temperature (PVT) noise. Using our framework, we model IR-drop and simulate its impact on the accuracy of ResNet-50 and ResNet-152 when classifying images from the ImageNet 1k dataset in the presence of injected PVT noise. We analyze the impact of through-silicon via (TSV) design and placement to optimize the IR-drop and classification accuracy. 
For ResNet architectures in 3-D integration, we demonstrate that peripheral TSV placement provides an optimal balance between interconnect complexity and performance, achieving IR-drop below 10% of $V_{\\text{DD}}$ while maintaining high classification accuracy.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"11 ","pages":"10-18"},"PeriodicalIF":2.0,"publicationDate":"2025-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10854426","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143379553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"2024 Index IEEE Journal on Exploratory Solid-State Computational Devices and Circuits Vol. 10","authors":"","doi":"10.1109/JXCDC.2025.3531616","DOIUrl":"https://doi.org/10.1109/JXCDC.2025.3531616","url":null,"abstract":"","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"10 ","pages":"187-194"},"PeriodicalIF":2.0,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10845029","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142992841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tzuping Huang;Linran Zhao;Yiming Han;Hai Li;Ian A. Young;Yaoyao Jia
{"title":"MESO-CMOS Hybrid Circuits With Time-Multiplexing Technique for Energy and Area-Efficient Computing in Memory","authors":"Tzuping Huang;Linran Zhao;Yiming Han;Hai Li;Ian A. Young;Yaoyao Jia","doi":"10.1109/JXCDC.2025.3530906","DOIUrl":"https://doi.org/10.1109/JXCDC.2025.3530906","url":null,"abstract":"The magnetoelectric spin orbit (MESO), one of the emerging spin devices, represents a promising alternative to complementary metal-oxide–semiconductor (CMOS) technology. MESO provides dual functionality: each device can perform logic operations while acting as a nonvolatile memory device. MESO also offers advantages, such as an ultralow supply voltage of 100 mV and the potential to vertically integrate with CMOS, which promises significant energy and area efficiency. These features support MESO’s suitability for improving the energy efficiency and area efficiency of computing-in-memory (CIM) circuits. To harness the advantages of MESO in large-scale complex circuit systems, this article presents the development of a MESO-based standard cell library. This library is critical to realize automated design, as it allows the implementation of all the basic CMOS functions with MESO, thereby enabling MESO-CMOS hybrid design in large-scale complex circuits. This article also introduces a highly area-efficient time-multiplexing technique to optimize the complex function inside CIM. 
Specifically, the multiplier and multiply-and-accumulate (MAC) circuits using the MESO-CMOS hybrid time-multiplexing technique reduce the area by 85% and 81%, respectively, compared to CMOS implementations.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"11 ","pages":"1-9"},"PeriodicalIF":2.0,"publicationDate":"2025-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10843777","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143379466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"INFORMATION FOR AUTHORS","authors":"","doi":"10.1109/JXCDC.2024.3499819","DOIUrl":"https://doi.org/10.1109/JXCDC.2024.3499819","url":null,"abstract":"","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"10 ","pages":"C3-C3"},"PeriodicalIF":2.0,"publicationDate":"2025-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10844024","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142992840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Special Topic on 3-D Logic and Memory for Energy Efficient Computing","authors":"editorial","doi":"10.1109/JXCDC.2024.3518312","DOIUrl":"https://doi.org/10.1109/JXCDC.2024.3518312","url":null,"abstract":"","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"10 ","pages":"iii-iv"},"PeriodicalIF":2.0,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10832462","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142938157","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"E-MAC: Enhanced In-SRAM MAC Accuracy via Digital-to-Time Modulation","authors":"Saeed Seyedfaraji;Salar Shakibhamedan;Amire Seyedfaraji;Baset Mesgari;Nima Taherinejad;Axel Jantsch;Semeen Rehman","doi":"10.1109/JXCDC.2024.3518633","DOIUrl":"https://doi.org/10.1109/JXCDC.2024.3518633","url":null,"abstract":"In this article, we introduce a novel technique called E-MAC, aimed at enhancing energy efficiency, reducing latency, and improving the accuracy of analog-based in-SRAM (static random access memory) multiplication-and-accumulation (MAC) accelerators. Our approach involves a digital-to-time word-line (WL) modulation technique that encodes the WL voltage while preserving the necessary linear voltage drop for precise computations. This eliminates the need for an additional digital-to-analog converter (DAC) in the design. Furthermore, the SRAM-based logical weight encoding scheme we present reduces the reliance on capacitance-based techniques, which typically introduce area overhead in the circuit. This approach ensures consistent voltage drops for all equivalent cases [i.e., $(a \\times b) = (b \\times a)$], addressing a persistent issue in existing state-of-the-art methods. Compared with state-of-the-art analog-based in-SRAM techniques, our E-MAC approach demonstrates significant energy savings ($1.89\\times$) and improved accuracy (73.25%) per MAC computation from a 1-V power supply, while achieving an $11.84\\times$ energy efficiency improvement over baseline digital approaches. 
Our application analysis shows a marginal overall reduction in accuracy, i.e., a 0.1% and 0.17% reduction for LeNet5-based CNN and VGG16, respectively, when trained on the MNIST and ImageNet datasets.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"10 ","pages":"178-186"},"PeriodicalIF":2.0,"publicationDate":"2024-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10804123","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142918524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Samantha Lubaba Noor;Xuan Wu;Dennis Lin;Pol van Dorpe;Francky Catthoor;Patrick Reynaert;Azad Naeemi
{"title":"Evaluation of a Plasmon-Based Optical Integrated Circuit for Error-Tolerant Streaming Applications","authors":"Samantha Lubaba Noor;Xuan Wu;Dennis Lin;Pol van Dorpe;Francky Catthoor;Patrick Reynaert;Azad Naeemi","doi":"10.1109/JXCDC.2024.3510684","DOIUrl":"https://doi.org/10.1109/JXCDC.2024.3510684","url":null,"abstract":"In this work, we have designed and modeled an integrated plasmonic computing module, which operates at 200 GHz clock frequency for high-end streaming algorithm applications. Our work includes designing the individual optical components (modulator, logic gate, and photodetector) and high-speed electronic driver circuits and integrating the components considering their interactions. We have also holistically evaluated the system-level performance of the computing module, taking into account various factors such as power consumption, operational speed, physical footprint, and average temperature. Through rigorous numerical analyses, we have found that with the existing technology and available materials, the plasmonic computing module can best achieve a bit-error-ratio (BER) of $10^{-1}$. 
The performance can be improved by using a high electrooptic coefficient material in the phase shifter and increasing the driver circuit’s swing to greater than 1 V.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"10 ","pages":"170-177"},"PeriodicalIF":2.0,"publicationDate":"2024-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10777494","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142918295","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ferroelectric Transistor-Based Synaptic Crossbar Arrays: The Impact of Ferroelectric Thickness and Device-Circuit Interactions","authors":"Chunguang Wang;Sumeet Kumar Gupta","doi":"10.1109/JXCDC.2024.3502053","DOIUrl":"https://doi.org/10.1109/JXCDC.2024.3502053","url":null,"abstract":"Ferroelectric transistors (FeFETs)-based crossbar arrays have shown immense promise for computing-in-memory (CiM) architectures targeted for neural accelerator designs. Offering CMOS compatibility, nonvolatility, compact bit cell, and CiM-amenable features, such as multilevel storage and voltage-driven conductance tuning, FeFETs are among the foremost candidates for synaptic devices. However, device and circuit nonideal attributes in FeFETs-based crossbar arrays cause the output currents to deviate from the expected value, which can induce error in CiM of matrix-vector multiplications (MVMs). In this article, we analyze the impact of ferroelectric thickness ($T_{\\text{FE}}$) and cross-layer interactions in FeFETs-based synaptic crossbar arrays accounting for device-circuit nonidealities. First, based on a physics-based model of multidomain FeFETs calibrated to experiments, we analyze the impact of $T_{\\text{FE}}$ on the characteristics of FeFETs as synaptic devices, highlighting the connections between the multidomain physics and the synaptic attributes. Based on this analysis, we investigate the impact of $T_{\\text{FE}}$ in conjunction with other design parameters, such as number of bits stored per device (bit slice), wordline (WL) activation schemes, and FeFETs width on the error probability, area, energy, and latency of CiM at the array level. 
Our results show that FeFETs with $T_{\\text{FE}}$ around 7 nm achieve the highest CiM robustness, while FeFETs with $T_{\\text{FE}}$ around 10 nm offer the lowest CiM energy and latency. While the CiM robustness for bit slice 2 is less than bit slice 1, its robustness can be brought to a target level via additional design techniques, such as partial wordline activation and optimization of FeFETs width.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"10 ","pages":"144-152"},"PeriodicalIF":2.0,"publicationDate":"2024-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10756727","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142798031","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Keming Fan;Ashkan Moradifirouzabadi;Xiangjin Wu;Zheyu Li;Flavio Ponzina;Anton Persson;Eric Pop;Tajana Rosing;Mingu Kang
{"title":"SpecPCM: A Low-Power PCM-Based In-Memory Computing Accelerator for Full-Stack Mass Spectrometry Analysis","authors":"Keming Fan;Ashkan Moradifirouzabadi;Xiangjin Wu;Zheyu Li;Flavio Ponzina;Anton Persson;Eric Pop;Tajana Rosing;Mingu Kang","doi":"10.1109/JXCDC.2024.3498837","DOIUrl":"https://doi.org/10.1109/JXCDC.2024.3498837","url":null,"abstract":"Mass spectrometry (MS) is essential for proteomics and metabolomics but faces impending challenges in efficiently processing the vast volumes of data. This article introduces SpecPCM, an in-memory computing (IMC) accelerator designed to achieve substantial improvements in energy and delay efficiency for both MS spectral clustering and database (DB) search. SpecPCM employs analog processing with low-voltage swing and utilizes recently introduced phase change memory (PCM) devices based on superlattice materials, optimized for low-voltage and low-power programming. Our approach integrates contributions across multiple levels: application, algorithm, circuit, device, and instruction sets. We leverage a robust hyperdimensional computing (HD) algorithm with a novel dimension-packing method and develop specialized hardware for the end-to-end MS pipeline to overcome the nonideal behavior of PCM devices. We further optimize multilevel PCM devices for different tasks by using different materials. We also perform a comprehensive design exploration to improve energy and delay efficiency while maintaining accuracy, exploring various combinations of hardware and software parameters controlled by the instruction set architecture (ISA). 
SpecPCM, with up to three bits per cell, achieves speedups of up to $82\\times$ and $143\\times$ for MS clustering and DB search tasks, respectively, along with a four-orders-of-magnitude improvement in energy efficiency compared with state-of-the-art (SoA) CPU/GPU tools.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"10 ","pages":"161-169"},"PeriodicalIF":2.0,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10753646","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142859023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}