{"title":"Improving DNN Accuracy on MLC PIM via Non-Ideal PIM Device Fine-Tuning","authors":"Hao Lv;Lei Zhang;Ying Wang","doi":"10.1109/TCAD.2024.3521195","DOIUrl":null,"url":null,"abstract":"Resistive random access memory (RRAM) emerges as a promising technology for developing energy-efficient deep neural network (DNN) accelerators, owing to its analog computing paradigm for matrix-vector multiplication. However, the inherent nonideal device features of RRAM cells, such as device variation, read disturbances, and limited on/off ratio, present challenges for model deployment. Therefore, to ensure accurate storage and computing precision for RRAM-based accelerators, a widely used practice is encoding a DNN weight by multiple cells, resulting in significant memory overhead and underutilization. This challenge is further exacerbated by the rapid increases in model size witnessed in recent years. While the one-to-one weight-cell mapping strategy can improve memory utilization, it inevitably introduces deviations in the mapped DNN weight from the desired value due to RRAM variation issues, leading to model accuracy degradation. In response to this challenge, we abstract the model optimization on RRAM chips as a non-ideal PIM device optimization problem, aimed at optimizing model accuracy without the requirement of precise weight programming. We systematically analyze the model optimization behavior on multilevel RRAM devices by investigating the accuracy recovery process of various fine-tuning strategies in recovering model performance under the non-ideal PIM device setting. Based on the analysis, we propose a non-ideal PIM device finetune scheme to recover the model performance for multilevel RRAM under the non-ideal PIM device setting. Our proposed scheme leverages knowledge distillation and exploits input/output information of the model on RRAM to guide the fine-tuning process, finally restoring its accuracy. Experimental results demonstrate the efficacy of our non-ideal PIM device fine-tuning scheme, achieving nearly complete recovery of model performance. Our approach yields over a 3% improvement in model accuracy compared to variation-aware training approaches.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"44 6","pages":"2277-2286"},"PeriodicalIF":2.7000,"publicationDate":"2024-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10811964/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Citations: 0
Abstract
Resistive random access memory (RRAM) has emerged as a promising technology for energy-efficient deep neural network (DNN) accelerators, owing to its analog computing paradigm for matrix-vector multiplication. However, the inherent non-ideal device characteristics of RRAM cells, such as device variation, read disturbance, and a limited on/off ratio, complicate model deployment. To ensure storage and computing precision on RRAM-based accelerators, a common practice is to encode each DNN weight across multiple cells, which incurs significant memory overhead and underutilization; this problem is further exacerbated by the rapid growth in model size in recent years. A one-to-one weight-cell mapping improves memory utilization, but device variation inevitably causes the mapped weights to deviate from their intended values, degrading model accuracy. In response, we abstract model optimization on RRAM chips as a non-ideal PIM device optimization problem that aims to maximize model accuracy without requiring precise weight programming. We systematically analyze optimization behavior on multilevel RRAM devices by investigating how various fine-tuning strategies recover model accuracy under the non-ideal PIM device setting. Based on this analysis, we propose a non-ideal PIM device fine-tuning scheme that restores model performance on multilevel RRAM. The scheme leverages knowledge distillation and exploits the input/output information of the model on RRAM to guide fine-tuning. Experimental results demonstrate the efficacy of our scheme, achieving nearly complete recovery of model performance and over a 3% accuracy improvement compared to variation-aware training approaches.
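To make the idea concrete, below is a minimal sketch of the general approach the abstract describes: program weights onto multilevel cells with write deviation, then fine-tune the on-chip (noisy) model with knowledge distillation from the off-chip reference model. This assumes PyTorch, a 16-level (4-bit) cell, Gaussian write variation, and a KL-divergence distillation loss; the function names (`inject_mlc_variation`, `distill_finetune`) and hyperparameters are hypothetical and not the authors' implementation.

```python
# Illustrative sketch only; not the paper's actual scheme.
import copy
import torch
import torch.nn.functional as F

def inject_mlc_variation(model, n_levels=16, sigma=0.02):
    """Quantize each weight tensor to n_levels evenly spaced levels (one weight
    per MLC cell) and add Gaussian deviation to mimic write variation."""
    noisy = copy.deepcopy(model)
    with torch.no_grad():
        for p in noisy.parameters():
            w_max = p.abs().max().clamp(min=1e-8)
            step = 2 * w_max / (n_levels - 1)
            ideal = torch.round((p + w_max) / step) * step - w_max  # target level
            p.copy_(ideal + sigma * w_max * torch.randn_like(p))    # write variation
    return noisy

def distill_finetune(student, teacher, loader, epochs=3, lr=1e-4, T=4.0):
    """Fine-tune the noisy on-chip model using the reference model's outputs
    as soft targets (knowledge distillation); no labels required."""
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    teacher.eval()
    for _ in range(epochs):
        for x, _ in loader:
            with torch.no_grad():
                t_logits = teacher(x)
            s_logits = student(x)
            loss = F.kl_div(F.log_softmax(s_logits / T, dim=1),
                            F.softmax(t_logits / T, dim=1),
                            reduction="batchmean") * T * T
            opt.zero_grad()
            loss.backward()
            opt.step()
    return student
```

A typical use would be `on_chip = inject_mlc_variation(pretrained)` followed by `distill_finetune(on_chip, pretrained, train_loader)`; the paper's scheme additionally exploits the RRAM model's input/output information to guide this process, which the sketch does not capture.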
Journal description:
The purpose of this Transactions is to publish papers of interest to individuals in the area of computer-aided design of integrated circuits and systems composed of analog, digital, mixed-signal, optical, or microwave components. The aids include methods, models, algorithms, and man-machine interfaces for system-level, physical and logical design including: planning, synthesis, partitioning, modeling, simulation, layout, verification, testing, hardware-software co-design and documentation of integrated circuit and system designs of all complexities. Design tools and techniques for evaluating and designing integrated circuits and systems for metrics such as performance, power, reliability, testability, and security are a focus.