Title: Single-Pass: An Operation Unit-Based In-Memory Computing Architecture for Sparse Neural Networks
Authors: Shang Wang; Qi Cao; Yongqiang Wang; Hang Chen; Zhenjiao Chen; Feng Liang
Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 44, no. 8, pp. 2952-2965
Published: 2025-02-06
DOI: 10.1109/TCAD.2025.3539592
URL: https://ieeexplore.ieee.org/document/10876165/
Impact factor: 2.7 (JCR Q2, Computer Science, Hardware & Architecture)
Open access: no
Citations: 0
Abstract
Compute-in-memory (CIM) has emerged as a prominent research focus in recent years, offering a promising alternative to traditional von Neumann computer architectures. However, the extensive array structures and peripheral circuits inherent in CIM introduce challenges related to latency and power consumption. The operation unit (OU) has gained attention as a practical solution to these issues, significantly transforming the computational paradigm of in-memory computing. Despite its potential, the possibilities enabled by this approach remain underexplored. This article presents a novel architecture, Single-Pass, designed around OU implementation with a new OU partitioning method optimized for sparse networks. Additionally, we propose a matrix compression technique leveraging a dual heuristic greedy algorithm (DHGA), forming the foundation of our architecture-specific mapping strategy. Experimental results demonstrate that, within given area constraints, our architecture achieves an average energy efficiency improvement of 29.8% and a speedup of 82.3% across various networks compared to the baseline.
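As a rough illustration of the general OU idea mentioned in the abstract, the sketch below tiles a weight matrix into fixed-size OUs and skips any OU whose weights are all zero. It is not the Single-Pass partitioning method or the DHGA compression, neither of which the abstract describes in detail. The function names `count_active_ous` and `ou_matvec`, the fixed rectangular OU shape, and the convention that inputs drive rows and outputs accumulate on columns are assumptions made only for this example.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def _ou_is_zero(weights: Matrix, rows: range, cols: range) -> bool:
    """True if every weight inside the OU (rows x cols) is zero."""
    return all(weights[r][c] == 0 for r in rows for c in cols)


def count_active_ous(weights: Matrix, ou_rows: int, ou_cols: int) -> Tuple[int, int]:
    """Return (active, total) OU counts when the matrix is tiled into OUs.

    An OU counts as active if it holds at least one non-zero weight.
    """
    n_rows, n_cols = len(weights), len(weights[0])
    active = total = 0
    for r0 in range(0, n_rows, ou_rows):
        for c0 in range(0, n_cols, ou_cols):
            rows = range(r0, min(r0 + ou_rows, n_rows))
            cols = range(c0, min(c0 + ou_cols, n_cols))
            total += 1
            if not _ou_is_zero(weights, rows, cols):
                active += 1
    return active, total


def ou_matvec(weights: Matrix, x: List[float], ou_rows: int, ou_cols: int) -> List[float]:
    """Compute y[c] = sum_r x[r] * weights[r][c] one OU at a time.

    All-zero OUs are skipped, which models the work a sparse network
    can avoid (hypothetical cost model, for illustration only).
    """
    n_rows, n_cols = len(weights), len(weights[0])
    y = [0.0] * n_cols
    for r0 in range(0, n_rows, ou_rows):
        for c0 in range(0, n_cols, ou_cols):
            rows = range(r0, min(r0 + ou_rows, n_rows))
            cols = range(c0, min(c0 + ou_cols, n_cols))
            if _ou_is_zero(weights, rows, cols):
                continue  # nothing to accumulate from this OU
            for c in cols:
                y[c] += sum(x[r] * weights[r][c] for r in rows)
    return y
```

For a 4x4 matrix whose non-zero weights fall in only two of its four 2x2 OUs, `count_active_ous` reports two active OUs out of four, and `ou_matvec` returns the same result as a dense product while visiting only the active OUs. How weights are arranged so that more OUs become empty is the kind of question a compression or mapping strategy would address; this sketch does not attempt it.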
About the journal:
The purpose of this Transactions is to publish papers of interest to individuals in the area of computer-aided design of integrated circuits and systems composed of analog, digital, mixed-signal, optical, or microwave components. The aids include methods, models, algorithms, and man-machine interfaces for system-level, physical and logical design including: planning, synthesis, partitioning, modeling, simulation, layout, verification, testing, hardware-software co-design and documentation of integrated circuit and system designs of all complexities. Design tools and techniques for evaluating and designing integrated circuits and systems for metrics such as performance, power, reliability, testability, and security are a focus.