Improving DNN Accuracy on MLC PIM via Non-Ideal PIM Device Fine-Tuning

IF 2.7 · Zone 3, Computer Science · Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
Hao Lv;Lei Zhang;Ying Wang
Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, volume 44, issue 6, pages 2277-2286
DOI: 10.1109/TCAD.2024.3521195
Publication date: 2024-12-23
Publication type: Journal Article
Open access: No
URL: https://ieeexplore.ieee.org/document/10811964/
Citations: 0

Abstract

Resistive random access memory (RRAM) has emerged as a promising technology for building energy-efficient deep neural network (DNN) accelerators, owing to its analog computing paradigm for matrix-vector multiplication. However, the inherent non-ideal device features of RRAM cells, such as device variation, read disturbance, and a limited on/off ratio, make model deployment challenging. To ensure accurate storage and computing precision on RRAM-based accelerators, a widely used practice is to encode each DNN weight with multiple cells, which causes significant memory overhead and underutilization. The rapid growth in model size in recent years exacerbates this problem. A one-to-one weight-cell mapping strategy improves memory utilization, but RRAM variation inevitably makes the mapped DNN weight deviate from its desired value, which degrades model accuracy. In response to this challenge, we abstract model optimization on RRAM chips as a non-ideal PIM device optimization problem, whose goal is to optimize model accuracy without requiring precise weight programming. We systematically analyze model optimization behavior on multilevel RRAM devices by investigating how various fine-tuning strategies recover accuracy under the non-ideal PIM device setting. Based on this analysis, we propose a non-ideal PIM device fine-tuning scheme that recovers model performance for multilevel RRAM in this setting. The proposed scheme leverages knowledge distillation and exploits the input/output information of the model on RRAM to guide the fine-tuning process and restore its accuracy. Experimental results demonstrate the efficacy of our non-ideal PIM device fine-tuning scheme, which achieves nearly complete recovery of model performance. Our approach yields an improvement of over 3% in model accuracy compared to variation-aware training approaches.
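To make the setting in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a uniform set of conductance levels and Gaussian programming variation (the names `program_to_mlc`, `n_levels`, and `sigma` are hypothetical and their values illustrative), maps each weight to a single multilevel cell, and measures how far the outputs of the deviated layer drift from those of the ideal layer using a standard temperature-scaled distillation loss. A fine-tuning procedure would then minimize a loss of this kind over whatever parameters remain trainable.

```python
import numpy as np

def program_to_mlc(weights, n_levels=4, sigma=0.05, rng=None):
    """One-to-one weight-to-cell mapping with assumed Gaussian variation.

    n_levels (cell levels) and sigma (variation as a fraction of the full
    weight range) are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    w_max = np.max(np.abs(weights))
    # Uniformly spaced target levels spanning [-w_max, w_max].
    step = 2.0 * w_max / (n_levels - 1)
    ideal = np.round((weights + w_max) / step) * step - w_max
    # Each cell lands near, but not exactly on, its target level.
    noise = rng.normal(0.0, sigma * 2.0 * w_max, size=weights.shape)
    return ideal, ideal + noise

def softmax(logits, temperature=1.0):
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs, averaged over the batch."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)
    return float(np.mean(kl)) * temperature ** 2

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))              # batch of layer inputs
w = rng.normal(scale=0.5, size=(16, 10))  # ideal (software) weights
ideal_w, device_w = program_to_mlc(w, n_levels=4, sigma=0.05, rng=rng)

teacher_logits = x @ w         # output of the ideal model
device_logits = x @ device_w   # output of the model as mapped to cells
loss = distillation_loss(device_logits, teacher_logits)
print(f"distillation loss of the deviated layer: {loss:.4f}")
```

The teacher here is the ideal software layer and the student is the same layer after mapping, so the loss depends only on input/output behavior and never on the exact programmed cell values.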
Source journal
CiteScore: 5.60
Self-citation rate: 13.80%
Articles published: 500
Review time: 7 months
Journal description: The purpose of this Transactions is to publish papers of interest to individuals in the area of computer-aided design of integrated circuits and systems composed of analog, digital, mixed-signal, optical, or microwave components. The aids include methods, models, algorithms, and man-machine interfaces for system-level, physical and logical design including: planning, synthesis, partitioning, modeling, simulation, layout, verification, testing, hardware-software co-design and documentation of integrated circuit and system designs of all complexities. Design tools and techniques for evaluating and designing integrated circuits and systems for metrics such as performance, power, reliability, testability, and security are a focus.