忆阻纵横制阵列中容错权值映射的有效分组方法

Dev Narayan Yadav , Phrangboklang Lyngton Thangkhiew , Sandip Chakraborty , Indranil Sengupta
{"title":"忆阻纵横制阵列中容错权值映射的有效分组方法","authors":"Dev Narayan Yadav ,&nbsp;Phrangboklang Lyngton Thangkhiew ,&nbsp;Sandip Chakraborty ,&nbsp;Indranil Sengupta","doi":"10.1016/j.memori.2023.100045","DOIUrl":null,"url":null,"abstract":"<div><p>The ability of resistive memory (ReRAM) to naturally conduct vector–matrix multiplication (VMM), which is the primary operation carried out during the training and inference of neural networks, has caught the interest of researchers. The memristor crossbar is one of the desirable architectures to perform VMM because it offers various benefits over other memory technologies, including in-memory computing, low power, and high density. Direct downloading and chip-on-the-loop approaches are typically used to train ReRAM-based neural networks. In these methods, all weight computations are carried out by a host machine, and the computed weights are downloaded in the crossbar. It has been seen that the network does not deliver the same precision as promised by the host system once the weights have been downloaded. This is because crossbars contain a significant number of faulty memristors and suffer from cell resistance variations because of immature manufacturing technologies. As a result, a cell may not be able to take the exact weight values that the host system generates, and may lead to incorrect inferences. Existing techniques for fault-tolerant mapping either involve network retraining or employ a graph-matching strategy that comes with hardware, power, and latency overheads. In this paper, we propose a mapping method to tolerate the effect of defective memristors. In order to lessen the impact of faulty memristors, the mapping is done in a way that allows network weights to cover up faulty memristors. Further, this work prioritizes the different faults based on the frequency of occurrence. The mapping efficiency is found to increase significantly with low power, area and latency overheads in the proposed approach. Experimental analyses show considerable improvement as compared to state-of-the-art works.</p></div>","PeriodicalId":100915,"journal":{"name":"Memories - Materials, Devices, Circuits and Systems","volume":"4 ","pages":"Article 100045"},"PeriodicalIF":0.0000,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Efficient grouping approach for fault tolerant weight mapping in memristive crossbar array\",\"authors\":\"Dev Narayan Yadav ,&nbsp;Phrangboklang Lyngton Thangkhiew ,&nbsp;Sandip Chakraborty ,&nbsp;Indranil Sengupta\",\"doi\":\"10.1016/j.memori.2023.100045\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>The ability of resistive memory (ReRAM) to naturally conduct vector–matrix multiplication (VMM), which is the primary operation carried out during the training and inference of neural networks, has caught the interest of researchers. The memristor crossbar is one of the desirable architectures to perform VMM because it offers various benefits over other memory technologies, including in-memory computing, low power, and high density. Direct downloading and chip-on-the-loop approaches are typically used to train ReRAM-based neural networks. In these methods, all weight computations are carried out by a host machine, and the computed weights are downloaded in the crossbar. It has been seen that the network does not deliver the same precision as promised by the host system once the weights have been downloaded. This is because crossbars contain a significant number of faulty memristors and suffer from cell resistance variations because of immature manufacturing technologies. As a result, a cell may not be able to take the exact weight values that the host system generates, and may lead to incorrect inferences. Existing techniques for fault-tolerant mapping either involve network retraining or employ a graph-matching strategy that comes with hardware, power, and latency overheads. In this paper, we propose a mapping method to tolerate the effect of defective memristors. In order to lessen the impact of faulty memristors, the mapping is done in a way that allows network weights to cover up faulty memristors. Further, this work prioritizes the different faults based on the frequency of occurrence. The mapping efficiency is found to increase significantly with low power, area and latency overheads in the proposed approach. Experimental analyses show considerable improvement as compared to state-of-the-art works.</p></div>\",\"PeriodicalId\":100915,\"journal\":{\"name\":\"Memories - Materials, Devices, Circuits and Systems\",\"volume\":\"4 \",\"pages\":\"Article 100045\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Memories - Materials, Devices, Circuits and Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2773064623000221\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Memories - Materials, Devices, Circuits and Systems","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2773064623000221","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

电阻存储器(ReRAM)自然进行向量-矩阵乘法(VMM)的能力引起了研究人员的兴趣,这是神经网络训练和推理过程中的主要操作。忆阻器交叉开关是执行VMM的理想架构之一,因为它提供了优于其他存储器技术的各种优点,包括内存内计算、低功耗和高密度。直接下载和芯片在环方法通常用于训练基于ReRAM的神经网络。在这些方法中,所有的权重计算都由主机执行,并且计算的权重被下载到纵横制中。已经看到,一旦下载了权重,网络就不能提供与主机系统承诺的精度相同的精度。这是因为横杆包含大量有故障的忆阻器,并且由于制造技术不成熟而导致单元电阻变化。因此,细胞可能无法获得宿主系统生成的确切权重值,并可能导致错误的推断。现有的容错映射技术要么涉及网络再训练,要么采用伴随硬件、电源和延迟开销的图匹配策略。在本文中,我们提出了一种映射方法来容忍缺陷忆阻器的影响。为了减少故障忆阻器的影响,映射是以允许网络权重掩盖故障忆阻的方式进行的。此外,这项工作根据发生频率对不同的故障进行优先级排序。发现在所提出的方法中,映射效率在低功率、低面积和低延迟开销的情况下显著提高。实验分析表明,与最先进的工作相比,有了相当大的改进。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Efficient grouping approach for fault tolerant weight mapping in memristive crossbar array

The ability of resistive memory (ReRAM) to naturally conduct vector–matrix multiplication (VMM), which is the primary operation carried out during the training and inference of neural networks, has caught the interest of researchers. The memristor crossbar is one of the desirable architectures to perform VMM because it offers various benefits over other memory technologies, including in-memory computing, low power, and high density. Direct downloading and chip-on-the-loop approaches are typically used to train ReRAM-based neural networks. In these methods, all weight computations are carried out by a host machine, and the computed weights are downloaded in the crossbar. It has been seen that the network does not deliver the same precision as promised by the host system once the weights have been downloaded. This is because crossbars contain a significant number of faulty memristors and suffer from cell resistance variations because of immature manufacturing technologies. As a result, a cell may not be able to take the exact weight values that the host system generates, and may lead to incorrect inferences. Existing techniques for fault-tolerant mapping either involve network retraining or employ a graph-matching strategy that comes with hardware, power, and latency overheads. In this paper, we propose a mapping method to tolerate the effect of defective memristors. In order to lessen the impact of faulty memristors, the mapping is done in a way that allows network weights to cover up faulty memristors. Further, this work prioritizes the different faults based on the frequency of occurrence. The mapping efficiency is found to increase significantly with low power, area and latency overheads in the proposed approach. Experimental analyses show considerable improvement as compared to state-of-the-art works.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信