3-D Digital Compute-in-Memory Benchmark With A5 CFET Technology: An Extension to Lookup-Table-Based Design
Authors: Junmo Lee; Minji Shon; Faaiq Waqar; Shimeng Yu
Journal: IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 33, no. 7, pp. 1910-1919
DOI: 10.1109/TVLSI.2025.3566346
Publication date: 2025-03-21
Publication type: Journal Article
URL: https://ieeexplore.ieee.org/document/11008711/
Journal impact factor: 3.1 (JCR Q2, Computer Science, Hardware & Architecture)
Citation count: 0
Abstract
Digital compute-in-memory (DCIM) has emerged as a promising solution to the scalability and accuracy challenges of analog compute-in-memory (ACIM) for next-generation AI hardware acceleration. In this work, we present a comprehensive device-to-system codesign process for two proposed 3-D DCIM architectures at the projected 5-angstrom (A5) complementary FET (CFET) technology node: 1) 3-D DCIM based on an 8T DCIM bit cell and 2) lookup-table (LUT)-based 3-D DCIM. A novel A5 CFET-based 8T DCIM bit cell (6T SRAM + 2T AND gate) is proposed to improve total footprint and latency over the conventional 10T DCIM bit cell, and its functionality is verified through technology computer-aided design (TCAD) simulation. For macro- and system-level evaluation of the proposed 3-D DCIM architectures, an extended NeuroSim V1.4 framework is developed, the first compute-in-memory (CIM) benchmark framework enabling CIM simulation at the A5 CFET technology node. We demonstrate that the proposed 3-D DCIM with the 8T DCIM bit cell at the A5 CFET technology node can achieve an $8.2\times$ improvement in figure of merit (FOM = TOPS/W $\times$ TOPS/mm$^2$) over the state-of-the-art 3-nm FinFET-based DCIM design. The LUT-based 3-D DCIM design is additionally proposed to further reduce power consumption relative to the 8T DCIM bit-cell-based 3-D DCIM. The LUT-based 3-D DCIM achieves a 44% reduction in energy consumption compared to the conventional 10T DCIM bit-cell-based 3-D DCIM. Our findings suggest significant implications of technology scaling below 1 nm for high-performance DCIM design.
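The figure of merit quoted in the abstract is the product of energy efficiency and compute density. The following is a minimal Python sketch of that arithmetic, not code from the paper or from NeuroSim: the function names are hypothetical, and the conversion from energy reduction to an efficiency gain assumes the same number of operations is performed in both designs, which the abstract does not state.

```python
def figure_of_merit(tops_per_watt: float, tops_per_mm2: float) -> float:
    """FOM as defined in the abstract: TOPS/W multiplied by TOPS/mm^2."""
    return tops_per_watt * tops_per_mm2


def fom_improvement(new_fom: float, baseline_fom: float) -> float:
    """Ratio of a new design's FOM to a baseline FOM (e.g. the reported 8.2x)."""
    if baseline_fom <= 0:
        raise ValueError("baseline FOM must be positive")
    return new_fom / baseline_fom


def efficiency_gain_from_energy_reduction(reduction: float) -> float:
    """Energy-efficiency gain implied by a fractional energy reduction.

    Assumption (not from the paper): both designs execute the same number
    of operations, so efficiency scales as 1 / (1 - reduction).
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


if __name__ == "__main__":
    # Under the equal-workload assumption, the reported 44% energy reduction
    # corresponds to 1 / 0.56 times the operations per joule.
    gain = efficiency_gain_from_energy_reduction(0.44)
    print(f"Implied energy-efficiency gain: {gain:.2f}x")
```

Because FOM is a product, an improvement in either factor alone scales it proportionally; the abstract reports only the combined ratio, so the split between TOPS/W and TOPS/mm$^2$ cannot be recovered from it.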
About the journal:
The IEEE Transactions on VLSI Systems is published as a monthly journal under the co-sponsorship of the IEEE Circuits and Systems Society, the IEEE Computer Society, and the IEEE Solid-State Circuits Society.
Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chip and wafer fabrication, packaging, testing, and systems applications. Generation of specifications, design, and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor, and process levels.
To address this critical area through a common forum, the IEEE Transactions on VLSI Systems was founded. The editorial board, consisting of international experts, invites original papers that emphasize the novel systems integration aspects of microelectronic systems, including interactions among systems design and partitioning, logic and memory design, digital and analog circuit design, layout synthesis, CAD tools, chip and wafer fabrication, testing and packaging, and systems-level qualification. Thus, the coverage of these Transactions focuses on VLSI/ULSI microelectronic systems integration.