{"title":"RM-NTT:一种基于ram的内存中计算数论转换加速器","authors":"Yongmo Park;Ziyu Wang;Sangmin Yoo;Wei D. Lu","doi":"10.1109/JXCDC.2022.3202517","DOIUrl":null,"url":null,"abstract":"As more cloud computing resources are used for machine learning training and inference processes, privacy-preserving techniques that protect data from revealing at the cloud platforms attract increasing interest. Homomorphic encryption (HE) is one of the most promising techniques that enable privacy-preserving machine learning because HE allows data to be evaluated under encrypted forms. However, deep neural network (DNN) implementations using HE are orders of magnitude slower than plaintext implementations. The use of very long polynomials and associated number theoretic transform (NTT) operations for polynomial multiplications is the main bottlenecks of HE implementation for practical uses. This article introduces RRAM number theoretic transform (RM-NTT): a resistive random access memory (RRAM)-based compute-in-memory (CIM) system to accelerate NTT and inverse NTT (INTT) operations. Instead of running fast Fourier transform (FFT)-like algorithms, RM-NTT uses a vector-matrix multiplication (VMM) approach to achieve maximal parallelism during NTT and INTT operations. To improve the efficiency, RM-NTT stores modified forms of the twiddle factors in the RRAM arrays to process NTT/INTT in the same RRAM array and employs a Montgomery reduction algorithm to convert the VMM results. The proposed optimization methods allow RM-NTT to significantly reduce NTT operation latency compared with other NTT accelerators, including both CIM and non-CIM-based designs. The effects of different RM-NTT design parameters and device nonidealities are also discussed.","PeriodicalId":54149,"journal":{"name":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","volume":null,"pages":null},"PeriodicalIF":2.0000,"publicationDate":"2022-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/6570653/9969523/09870678.pdf","citationCount":"3","resultStr":"{\"title\":\"RM-NTT: An RRAM-Based Compute-in-Memory Number Theoretic Transform Accelerator\",\"authors\":\"Yongmo Park;Ziyu Wang;Sangmin Yoo;Wei D. Lu\",\"doi\":\"10.1109/JXCDC.2022.3202517\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"As more cloud computing resources are used for machine learning training and inference processes, privacy-preserving techniques that protect data from revealing at the cloud platforms attract increasing interest. Homomorphic encryption (HE) is one of the most promising techniques that enable privacy-preserving machine learning because HE allows data to be evaluated under encrypted forms. However, deep neural network (DNN) implementations using HE are orders of magnitude slower than plaintext implementations. The use of very long polynomials and associated number theoretic transform (NTT) operations for polynomial multiplications is the main bottlenecks of HE implementation for practical uses. This article introduces RRAM number theoretic transform (RM-NTT): a resistive random access memory (RRAM)-based compute-in-memory (CIM) system to accelerate NTT and inverse NTT (INTT) operations. Instead of running fast Fourier transform (FFT)-like algorithms, RM-NTT uses a vector-matrix multiplication (VMM) approach to achieve maximal parallelism during NTT and INTT operations. To improve the efficiency, RM-NTT stores modified forms of the twiddle factors in the RRAM arrays to process NTT/INTT in the same RRAM array and employs a Montgomery reduction algorithm to convert the VMM results. The proposed optimization methods allow RM-NTT to significantly reduce NTT operation latency compared with other NTT accelerators, including both CIM and non-CIM-based designs. The effects of different RM-NTT design parameters and device nonidealities are also discussed.\",\"PeriodicalId\":54149,\"journal\":{\"name\":\"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":2.0000,\"publicationDate\":\"2022-08-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ieeexplore.ieee.org/iel7/6570653/9969523/09870678.pdf\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/9870678/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Journal on Exploratory Solid-State Computational Devices and Circuits","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/9870678/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
RM-NTT: An RRAM-Based Compute-in-Memory Number Theoretic Transform Accelerator
As more cloud computing resources are used for machine learning training and inference processes, privacy-preserving techniques that protect data from revealing at the cloud platforms attract increasing interest. Homomorphic encryption (HE) is one of the most promising techniques that enable privacy-preserving machine learning because HE allows data to be evaluated under encrypted forms. However, deep neural network (DNN) implementations using HE are orders of magnitude slower than plaintext implementations. The use of very long polynomials and associated number theoretic transform (NTT) operations for polynomial multiplications is the main bottlenecks of HE implementation for practical uses. This article introduces RRAM number theoretic transform (RM-NTT): a resistive random access memory (RRAM)-based compute-in-memory (CIM) system to accelerate NTT and inverse NTT (INTT) operations. Instead of running fast Fourier transform (FFT)-like algorithms, RM-NTT uses a vector-matrix multiplication (VMM) approach to achieve maximal parallelism during NTT and INTT operations. To improve the efficiency, RM-NTT stores modified forms of the twiddle factors in the RRAM arrays to process NTT/INTT in the same RRAM array and employs a Montgomery reduction algorithm to convert the VMM results. The proposed optimization methods allow RM-NTT to significantly reduce NTT operation latency compared with other NTT accelerators, including both CIM and non-CIM-based designs. The effects of different RM-NTT design parameters and device nonidealities are also discussed.