RM-NTT: An RRAM-Based Compute-in-Memory Number Theoretic Transform Accelerator

IF 2 Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
Yongmo Park;Ziyu Wang;Sangmin Yoo;Wei D. Lu
{"title":"RM-NTT: An RRAM-Based Compute-in-Memory Number Theoretic Transform Accelerator","authors":"Yongmo Park;Ziyu Wang;Sangmin Yoo;Wei D. Lu","doi":"10.1109/JXCDC.2022.3202517","DOIUrl":null,"url":null,"abstract":"As more cloud computing resources are used for machine learning training and inference processes, privacy-preserving techniques that protect data from revealing at the cloud platforms attract increasing interest. Homomorphic encryption (HE) is one of the most promising techniques that enable privacy-preserving machine learning because HE allows data to be evaluated under encrypted forms. However, deep neural network (DNN) implementations using HE are orders of magnitude slower than plaintext implementations. The use of very long polynomials and associated number theoretic transform (NTT) operations for polynomial multiplications is the main bottlenecks of HE implementation for practical uses. This article introduces RRAM number theoretic transform (RM-NTT): a resistive random access memory (RRAM)-based compute-in-memory (CIM) system to accelerate NTT and inverse NTT (INTT) operations. Instead of running fast Fourier transform (FFT)-like algorithms, RM-NTT uses a vector-matrix multiplication (VMM) approach to achieve maximal parallelism during NTT and INTT operations. To improve the efficiency, RM-NTT stores modified forms of the twiddle factors in the RRAM arrays to process NTT/INTT in the same RRAM array and employs a Montgomery reduction algorithm to convert the VMM results. The proposed optimization methods allow RM-NTT to significantly reduce NTT operation latency compared with other NTT accelerators, including both CIM and non-CIM-based designs. The effects of different RM-NTT design parameters and device nonidealities are also discussed.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":null,"pages":null},"PeriodicalIF":2.0000,"publicationDate":"2022-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/6570653/9969523/09870678.pdf","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/9870678/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
引用次数: 3

Abstract

As more cloud computing resources are used for machine learning training and inference processes, privacy-preserving techniques that protect data from revealing at the cloud platforms attract increasing interest. Homomorphic encryption (HE) is one of the most promising techniques that enable privacy-preserving machine learning because HE allows data to be evaluated under encrypted forms. However, deep neural network (DNN) implementations using HE are orders of magnitude slower than plaintext implementations. The use of very long polynomials and associated number theoretic transform (NTT) operations for polynomial multiplications is the main bottlenecks of HE implementation for practical uses. This article introduces RRAM number theoretic transform (RM-NTT): a resistive random access memory (RRAM)-based compute-in-memory (CIM) system to accelerate NTT and inverse NTT (INTT) operations. Instead of running fast Fourier transform (FFT)-like algorithms, RM-NTT uses a vector-matrix multiplication (VMM) approach to achieve maximal parallelism during NTT and INTT operations. To improve the efficiency, RM-NTT stores modified forms of the twiddle factors in the RRAM arrays to process NTT/INTT in the same RRAM array and employs a Montgomery reduction algorithm to convert the VMM results. The proposed optimization methods allow RM-NTT to significantly reduce NTT operation latency compared with other NTT accelerators, including both CIM and non-CIM-based designs. The effects of different RM-NTT design parameters and device nonidealities are also discussed.
RM-NTT:一种基于ram的内存中计算数论转换加速器
随着越来越多的云计算资源用于机器学习训练和推理过程,保护数据在云平台上不被泄露的隐私保护技术吸引了越来越多的兴趣。同态加密(HE)是实现保护隐私的机器学习的最有前途的技术之一,因为HE允许在加密形式下对数据进行评估。然而,使用HE的深度神经网络(DNN)实现比明文实现要慢几个数量级。使用超长多项式和相关的数论变换(NTT)运算进行多项式乘法是实际应用中HE实现的主要瓶颈。本文介绍了RRAM数论变换(RM-NTT):一种基于电阻式随机存取存储器(RRAM)的内存中计算(CIM)系统,可以加速NTT和逆NTT运算。RM-NTT不是运行快速的傅里叶变换(FFT)算法,而是使用向量矩阵乘法(VMM)方法在NTT和INTT操作期间实现最大的并行性。为了提高效率,RM-NTT在RRAM数组中存储修改后的旋转因子形式,以便在同一RRAM数组中处理NTT/INTT,并使用Montgomery约简算法转换VMM结果。与其他NTT加速器(包括基于CIM和非CIM的设计)相比,所提出的优化方法允许RM-NTT显著降低NTT操作延迟。讨论了不同的RM-NTT设计参数和器件非理想性的影响。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
CiteScore
5.00
自引率
4.20%
发文量
11
审稿时长
13 weeks
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信