Efficient Model Switching in RRAM-Based DNN Accelerators
Fang-Yi Gu; Ing-Chao Lin; Bing Li; Ulf Schlichtmann; Grace Li Zhang
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 44, no. 10, pp. 3738-3751
DOI: 10.1109/TCAD.2025.3550403
Published: 2025-03-11
Cited by: 0
Abstract
Resistive random access memory (RRAM) has emerged as a promising technology for deep neural network (DNN) accelerators, but programming every weight in a DNN onto RRAM cells for inference can be both time-consuming and energy-intensive, especially when switching between different DNN models. This article introduces a hardware-aware multimodel merging (HA3M) framework designed to minimize the need for reprogramming by maximizing weight reuse, while taking into account the hardware constraints of the accelerator. The framework includes three key approaches: 1) crossbar (XB)-aware model mapping (XAMM); 2) block-based layer matching (BLM); and 3) multimodel retraining (MMR). XAMM reduces the XB usage of the preprogrammed model on RRAM XBs while preserving the model's structure. BLM reuses preprogrammed weights in a block-based manner, ensuring the inference process remains unchanged. MMR then equalizes the block-based matched weights across multiple models. Experimental results show that the proposed framework significantly reduces programming cycles in multi-DNN switching scenarios while maintaining or even enhancing accuracy and eliminating the need for reprogramming.
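The core idea of block-based weight reuse can be illustrated with a toy sketch. This is not the authors' HA3M algorithm: the block size, the greedy matching, and the mean-absolute-difference tolerance are all illustrative assumptions, and a real RRAM accelerator would match blocks at crossbar-tile granularity with hardware-aware cost models.

```python
import numpy as np

rng = np.random.default_rng(0)
BLOCK = 4  # illustrative block edge; real crossbar tiles are larger (e.g., 128x128)

def to_blocks(w, b=BLOCK):
    """Split a 2-D weight matrix into non-overlapping b x b blocks."""
    rows, cols = w.shape
    return [w[i:i + b, j:j + b] for i in range(0, rows, b) for j in range(0, cols, b)]

def match_blocks(pre, new, tol=0.05):
    """Greedy matching: a block of the new model reuses a preprogrammed
    block when their mean absolute difference is within `tol`; otherwise
    it must be reprogrammed. Returns (#reused, #reprogrammed)."""
    reused = reprogrammed = 0
    for nb in new:
        if any(np.abs(nb - pb).mean() <= tol for pb in pre):
            reused += 1
        else:
            reprogrammed += 1
    return reused, reprogrammed

w_pre = rng.normal(0.0, 0.1, (16, 16))            # weights already on the crossbars
w_new = w_pre + rng.normal(0.0, 0.02, (16, 16))   # a similar model to switch to
reused, reprog = match_blocks(to_blocks(w_pre), to_blocks(w_new))
print(f"reused {reused} blocks, reprogrammed {reprog}")
```

When the two models are similar, as after the paper's multimodel retraining step, most blocks fall within tolerance and the count of reprogrammed blocks (a proxy for programming cycles) drops accordingly.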
About the Journal
The purpose of this Transactions is to publish papers of interest to individuals in the area of computer-aided design of integrated circuits and systems composed of analog, digital, mixed-signal, optical, or microwave components. The aids include methods, models, algorithms, and man-machine interfaces for system-level, physical and logical design including: planning, synthesis, partitioning, modeling, simulation, layout, verification, testing, hardware-software co-design and documentation of integrated circuit and system designs of all complexities. Design tools and techniques for evaluating and designing integrated circuits and systems for metrics such as performance, power, reliability, testability, and security are a focus.