{"title":"Device Nonideality-Aware Compute-in-Memory Array Architecting: Direct Voltage Sensing, I–V Symmetric Bitcell, and Padding Array","authors":"Jianzi Jin;Shifan Gao;Cimang Lu;Xiang Qiu;Yi Zhao","doi":"10.1109/JXCDC.2025.3539470","DOIUrl":"https://doi.org/10.1109/JXCDC.2025.3539470","url":null,"abstract":"A voltage sensing compute-in-memory (CIM) architecture has been designed to improve the analog computing accuracy, and a chip on 90-nm flash platform has been successfully fabricated, with the bidirectional operation enabled by the symmetric bitcell structure. By padding the weight sum to a global value for all bit lines (BLs), the costly multiplication postprocessing can be efficiently performed with the analog operation inside the array. The BL-differential voltage output scheme has two unique invariances. First, the so-called scaling invariance allows the weight matrix to be scaled to the full range for every BL. Second, the shifting invariance allows the weight to be tuned to a larger conductance with a better I–V linearity. Combined with the distributed padding, input voltage loss can also be reduced by suppressing the IR drop. The above schemes can significantly improve the linearity and reduce the relative weight error by >50%, as confirmed in applications from MNIST to face recognition, making it a promising solution for advanced artificial intelligence (AI) and memory computing applications.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"11 ","pages":"19-24"},"PeriodicalIF":2.0,"publicationDate":"2025-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10876176","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143553318","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Co-Optimization of Power Delivery Network Design for 3-D Heterogeneous Integration of RRAM-Based Compute In-Memory Accelerators","authors":"Madison Manley;James Read;Ankit Kaul;Shimeng Yu;Muhannad Bakir","doi":"10.1109/JXCDC.2025.3534560","DOIUrl":"https://doi.org/10.1109/JXCDC.2025.3534560","url":null,"abstract":"Three-dimensional heterogeneous integration (3D-HI) offers promising solutions for incorporating substantial embedded memory into cutting-edge analog compute-in-memory (CIM) AI accelerators, addressing the need for on-chip acceleration of large AI models. However, this approach faces challenges with power supply noise (PSN) margins due to $V_{\\text{DD}}$ scaling and increased power delivery network (PDN) impedance. This study demonstrates the necessity and benefits of 3D-HI for large-scale CIM accelerators, where 2-D implementations would exceed manufacturing reticle limits. Our 3-D designs achieve 39% higher energy efficiency, $8\\times$ higher operation density, and improved throughput through shorter vertical interconnects. We quantify steady-state IR-drop impacts in 3D-HI CIM architectures using a framework that combines PDN modeling, 3D-HI power, performance, area estimation, and behavioral modeling. We demonstrate that a drop in supply voltage to CIM arrays increases sensitivity to process, voltage, and temperature (PVT) noise. Using our framework, we model IR-drop and simulate its impact on the accuracy of ResNet-50 and ResNet-152 when classifying images from the ImageNet 1k dataset in the presence of injected PVT noise. We analyze the impact of through-silicon via (TSV) design and placement to optimize the IR-drop and classification accuracy. For ResNet architectures in 3-D integration, we demonstrate that peripheral TSV placement provides an optimal balance between interconnect complexity and performance, achieving IR-drop below 10% of $V_{\\text{DD}}$ while maintaining high classification accuracy.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"11 ","pages":"10-18"},"PeriodicalIF":2.0,"publicationDate":"2025-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10854426","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143379553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"2024 Index IEEE Journal on Exploratory Solid-State Computational Devices and Circuits Vol. 10","authors":"","doi":"10.1109/JXCDC.2025.3531616","DOIUrl":"https://doi.org/10.1109/JXCDC.2025.3531616","url":null,"abstract":"","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"10 ","pages":"187-194"},"PeriodicalIF":2.0,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10845029","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142992841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"INFORMATION FOR AUTHORS","authors":"","doi":"10.1109/JXCDC.2024.3499819","DOIUrl":"https://doi.org/10.1109/JXCDC.2024.3499819","url":null,"abstract":"","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"10 ","pages":"C3-C3"},"PeriodicalIF":2.0,"publicationDate":"2025-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10844024","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142992840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MESO-CMOS Hybrid Circuits With Time-Multiplexing Technique for Energy and Area-Efficient Computing in Memory","authors":"Tzuping Huang;Linran Zhao;Yiming Han;Hai Li;Ian A. Young;Yaoyao Jia","doi":"10.1109/JXCDC.2025.3530906","DOIUrl":"https://doi.org/10.1109/JXCDC.2025.3530906","url":null,"abstract":"The magnetoelectric spin orbit (MESO), one of the emerging spin devices, represents a promising alternative to complementary metal-oxide–semiconductor (CMOS) technology. MESO provides dual functionality: each device can perform logic operations while acting as a nonvolatile memory device. MESO also offers advantages, such as an ultralow supply voltage of 100 mV and the potential to vertically integrate with CMOS, which promises significant energy and area efficiency. These features support MESO’s suitability for improving the energy efficiency and area efficiency of computing-in-memory (CIM) circuits. To harness the advantages of MESO in large-scale complex circuit systems, this article presents the development of a MESO-based standard cell library. This library is critical to realize automated design, as it allows the implementation of all the basic CMOS functions with MESO, thereby enabling MESO-CMOS hybrid design in large-scale complex circuits. This article also introduces a highly area-efficient time-multiplexing technique to optimize the complex function inside CIM. Specifically, the multiplier and multiply-and-accumulate (MAC) circuits using the MESO-CMOS hybrid time-multiplexing technique reduce the area by 85% and 81%, respectively, compared to CMOS implementations.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"11 ","pages":"1-9"},"PeriodicalIF":2.0,"publicationDate":"2025-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10843777","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143379466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Special Topic on 3-D Logic and Memory for Energy Efficient Computing","authors":"editorial","doi":"10.1109/JXCDC.2024.3518312","DOIUrl":"https://doi.org/10.1109/JXCDC.2024.3518312","url":null,"abstract":"","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"10 ","pages":"iii-iv"},"PeriodicalIF":2.0,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10832462","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142938157","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"E-MAC: Enhanced In-SRAM MAC Accuracy via Digital-to-Time Modulation","authors":"Saeed Seyedfaraji;Salar Shakibhamedan;Amire Seyedfaraji;Baset Mesgari;Nima Taherinejad;Axel Jantsch;Semeen Rehman","doi":"10.1109/JXCDC.2024.3518633","DOIUrl":"https://doi.org/10.1109/JXCDC.2024.3518633","url":null,"abstract":"In this article, we introduce a novel technique called E-multiplication and accumulation (MAC) (EMAC), aimed at enhancing energy efficiency, reducing latency, and improving the accuracy of analog-based in-static random access memory (SRAM) MAC accelerators. Our approach involves a digital-to-time word-line (WL) modulation technique that encodes the WL voltage while preserving the necessary linear voltage drop for precise computations. This eliminates the need for an additional digital-to-analog converter (DAC) in the design. Furthermore, the SRAM-based logical weight encoding scheme we present reduces the reliance on capacitance-based techniques, which typically introduce area overhead in the circuit. This approach ensures consistent voltage drops for all equivalent cases [i.e., $(a \\times b) = (b \\times a)$], addressing a persistent issue in existing state-of-the-art methods. Compared with state-of-the-art analog-based in-SRAM techniques, our E-MAC approach demonstrates significant energy savings ($1.89\\times$) and improved accuracy (73.25%) per MAC computation from a 1-V power supply, while achieving a $11.84\\times$ energy efficiency improvement over baseline digital approaches. Our application analysis shows a marginal overall reduction in accuracy, i.e., a 0.1% and 0.17% reduction for LeNet5-based CNN and VGG16, respectively, when trained on the MNIST and ImageNet datasets.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"10 ","pages":"178-186"},"PeriodicalIF":2.0,"publicationDate":"2024-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10804123","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142918524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of a Plasmon-Based Optical Integrated Circuit for Error-Tolerant Streaming Applications","authors":"Samantha Lubaba Noor;Xuan Wu;Dennis Lin;Pol van Dorpe;Francky Catthoor;Patrick Reynaert;Azad Naeemi","doi":"10.1109/JXCDC.2024.3510684","DOIUrl":"https://doi.org/10.1109/JXCDC.2024.3510684","url":null,"abstract":"In this work, we have designed and modeled an integrated plasmonic computing module, which operates at 200 GHz clock frequency for high-end streaming algorithm applications. Our work includes designing the individual optical components (modulator, logic gate, and photodetector) and high-speed electronic driver circuits and integrating the components considering their interactions. We have also holistically evaluated the system-level performance of the computing module, taking into account various factors such as power consumption, operational speed, physical footprint, and average temperature. Through rigorous numerical analyses, we have found that with the existing technology and available materials, the plasmonic computing module can best achieve a bit-error-ratio (BER) of $10^{-1}$. The performance can be improved by using a high electrooptic coefficient material in the phase shifter and increasing the driver circuit’s swing to greater than 1 V.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"10 ","pages":"170-177"},"PeriodicalIF":2.0,"publicationDate":"2024-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10777494","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142918295","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ferroelectric Transistor-Based Synaptic Crossbar Arrays: The Impact of Ferroelectric Thickness and Device-Circuit Interactions","authors":"Chunguang Wang;Sumeet Kumar Gupta","doi":"10.1109/JXCDC.2024.3502053","DOIUrl":"https://doi.org/10.1109/JXCDC.2024.3502053","url":null,"abstract":"Ferroelectric transistors (FeFETs)-based crossbar arrays have shown immense promise for computing-in-memory (CiM) architectures targeted for neural accelerator designs. Offering CMOS compatibility, nonvolatility, compact bit cell, and CiM-amenable features, such as multilevel storage and voltage-driven conductance tuning, FeFETs are among the foremost candidates for synaptic devices. However, device and circuit nonideal attributes in FeFETs-based crossbar arrays cause the output currents to deviate from the expected value, which can induce error in CiM of matrix-vector multiplications (MVMs). In this article, we analyze the impact of ferroelectric thickness ($T_{\\text{FE}}$) and cross-layer interactions in FeFETs-based synaptic crossbar arrays accounting for device-circuit nonidealities. First, based on a physics-based model of multidomain FeFETs calibrated to experiments, we analyze the impact of $T_{\\text{FE}}$ on the characteristics of FeFETs as synaptic devices, highlighting the connections between the multidomain physics and the synaptic attributes. Based on this analysis, we investigate the impact of $T_{\\text{FE}}$ in conjunction with other design parameters, such as number of bits stored per device (bit slice), wordline (WL) activation schemes, and FeFETs width on the error probability, area, energy, and latency of CiM at the array level. Our results show that FeFETs with $T_{\\text{FE}}$ around 7 nm achieve the highest CiM robustness, while FeFETs with $T_{\\text{FE}}$ around 10 nm offer the lowest CiM energy and latency. While the CiM robustness for bit slice 2 is less than bit slice 1, its robustness can be brought to a target level via additional design techniques, such as partial wordline activation and optimization of FeFETs width.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":"10 ","pages":"144-152"},"PeriodicalIF":2.0,"publicationDate":"2024-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10756727","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142798031","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}