Optimized operation scheme of flash-memory-based neural network online training with ultra-high endurance

IF 4.8 · CAS Zone 4 (Physics & Astronomy) · JCR Q2 (Physics, Condensed Matter)
Yang Feng, Zhaohui Sun, Yueran Qi, Xuepeng Zhan, Junyu Zhang, Jing Liu, Masaharu Kobayashi, Jixuan Wu, Jiezhi Chen
DOI: 10.1088/1674-4926/45/1/012301
Journal: Journal of Semiconductors · Published: 2024-01-01 · Journal Article
Citations: 0

Abstract

With the rapid development of machine learning, the demand for highly efficient computing is becoming increasingly urgent. To break the bottleneck of the traditional von Neumann architecture, computing-in-memory (CIM) has attracted increasing attention in recent years. In this work, to provide a feasible CIM solution for large-scale neural networks (NNs) that require continuous weight updating during online training, a flash-based computing-in-memory scheme with high endurance (10⁹ cycles) and ultra-fast programming speed is investigated. On the one hand, the proposed programming scheme of channel hot-electron injection (CHEI) and hot-hole injection (HHI) demonstrates high linearity and symmetric potentiation and depression processes, which help to improve training speed and accuracy. On the other hand, the low-damage programming scheme and memory-window (MW) optimizations effectively suppress cell degradation while improving computing accuracy. Even after 10⁹ cycles, the leakage current (I_off) of the cells remains below 10 pA, ensuring the large-scale computing capability of the memory. Further characterization of read disturb demonstrates robust reliability. On CIFAR-10 tasks, ~90% accuracy is achieved after 10⁹ cycles with both ResNet50 and VGG16 NNs. Our results suggest that flash-based CIM has great potential to overcome the limitations of traditional von Neumann architectures and to enable high-performance NN online training, paving the way for further development of artificial intelligence (AI) accelerators.
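To illustrate why linear, symmetric potentiation/depression matters for online training, the sketch below uses a generic exponential-saturation model of analog conductance updates that is common in the CIM literature. This is a hypothetical behavioral model with made-up parameters (`A`, `N_PULSES`, the normalized conductance range), not the device characteristics or programming scheme reported in the paper; it only shows how a larger linearity parameter brings the pulse-by-pulse weight update closer to the ideal straight line.

```python
import numpy as np

# Behavioral model of analog conductance updates (hypothetical parameters,
# not taken from the paper). Potentiation and depression follow an
# exponential-saturation curve; the parameter A controls how linear and
# symmetric the update steps are.
G_MIN, G_MAX = 0.0, 1.0   # normalized conductance range (memory window)
N_PULSES = 64             # number of program pulses across the window

def potentiation(n, A=8.0):
    """Normalized conductance after n potentiation pulses (CHEI-like, idealized)."""
    return G_MIN + (G_MAX - G_MIN) * (1 - np.exp(-n / A)) / (1 - np.exp(-N_PULSES / A))

def depression(n, A=8.0):
    """Normalized conductance after n depression pulses (HHI-like, idealized)."""
    return G_MAX - (G_MAX - G_MIN) * (1 - np.exp(-n / A)) / (1 - np.exp(-N_PULSES / A))

n = np.arange(N_PULSES + 1)
ideal = np.linspace(G_MIN, G_MAX, N_PULSES + 1)  # perfectly linear update

# A larger A gives near-linear, symmetric curves -- the property that
# linearity-optimized programming schemes aim for in online training.
for A in (2.0, 32.0):
    gp, gd = potentiation(n, A), depression(n, A)
    print(f"A={A:5.1f}  potentiation dev={np.abs(gp - ideal).max():.3f}  "
          f"depression dev={np.abs(gd - ideal[::-1]).max():.3f}")
```

In this toy model, a strongly saturating device (small `A`) packs most of the conductance change into the first few pulses, so equal-sized weight updates become impossible near the ends of the window; a near-linear device (large `A`) keeps every pulse's effect roughly uniform, which is what lets gradient-based online training converge accurately.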
Source journal: Journal of Semiconductors (Physics, Condensed Matter)
CiteScore: 6.70 · Self-citation rate: 9.80% · Articles per year: 119
Scope: Journal of Semiconductors publishes articles that emphasize semiconductor physics, materials, devices, circuits, and related technology.