{"title":"Fast Permutation Architecture on Encrypted Data for Secure Neural Network Inference","authors":"Xiao Hu, Jing Tian, Zhongfeng Wang","doi":"10.1109/APCCAS50809.2020.9301698","DOIUrl":null,"url":null,"abstract":"Recently, the secure neural network inference, an organic combination of the homomorphic encryption (HE) and the deep neural network (DNN), has attracted much attention. Nevertheless, the large number computations, brought by the HE scheme, form the bottleneck for real-time applications. A significant portion of the network is the permutation (Perm), which is mainly made up of the number theoretic transform (NTT). In this paper, for the first time, we propose an efficient architecture for the Perm by incorporating algorithmic transformations and architectural level optimizations. First, the core butterfly unit (BU) of NTT is optimized, which reduces the multiplication operations by about 30% compared with the original BU. Then, based on the optimization, a highly parallelized architecture is devised for the Perm. The operations in different modules are well managed by a merging strategy to balance the data path and reduce the memory access. The proposed architecture is synthesized under the TSMC 28-nm CMOS technology. The experimental results show that for the ciphertext size of 2048×60 bits, the proposed design achieves a 7.54x speedup compared to the implementation on an Intel(R) Core(TM) i7-6850K 3.60Hz CPU. 
Moreover, we apply eight Perm engines to the 1D convolution, which shows a 17.25x speedup over the software implementation.","PeriodicalId":127075,"journal":{"name":"2020 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS)","volume":"59 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/APCCAS50809.2020.9301698","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 2
Abstract
Secure neural network inference, which combines homomorphic encryption (HE) with deep neural networks (DNNs), has recently attracted much attention. Nevertheless, the large number of computations introduced by the HE scheme forms the bottleneck for real-time applications. A significant portion of this computation is the permutation (Perm), which mainly consists of the number theoretic transform (NTT). In this paper, we propose, for the first time, an efficient architecture for the Perm that incorporates algorithmic transformations and architecture-level optimizations. First, the core butterfly unit (BU) of the NTT is optimized, reducing the number of multiplication operations by about 30% compared with the original BU. Then, based on this optimization, a highly parallelized architecture is devised for the Perm. The operations in different modules are coordinated by a merging strategy that balances the data path and reduces memory accesses. The proposed architecture is synthesized with TSMC 28-nm CMOS technology. Experimental results show that, for a ciphertext size of 2048×60 bits, the proposed design achieves a 7.54x speedup over an implementation on an Intel(R) Core(TM) i7-6850K 3.60 GHz CPU. Moreover, applying eight Perm engines to a 1D convolution yields a 17.25x speedup over the software implementation.
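To make the NTT butterfly structure concrete, the sketch below implements a minimal radix-2 Cooley-Tukey NTT in Python: each inner-loop iteration is one butterfly, pairing one modular multiplication with a modular add/subtract. This is the generic textbook NTT, not the paper's optimized BU or its 2048×60-bit parameter set; the toy modulus `Q = 17`, length `N = 8`, and root `OMEGA = 9` are illustrative assumptions chosen so the arithmetic is easy to check by hand.

```python
# Toy NTT parameters (assumptions for illustration, not the paper's):
# Q is an NTT-friendly prime with Q ≡ 1 (mod N), so an N-th root of unity exists.
Q = 17       # modulus
N = 8        # transform length (power of two)
OMEGA = 9    # primitive N-th root of unity mod Q (9^4 ≡ -1, 9^8 ≡ 1 mod 17)

def ntt(a, q=Q, omega=OMEGA):
    """In-place-style radix-2 decimation-in-time NTT of a length-N list.

    Returns [sum_j a[j] * omega^(j*k) mod q for k in range(N)],
    computed with log2(N) stages of butterflies.
    """
    a = list(a)
    n = len(a)
    # Bit-reversal permutation so butterflies can be applied in order.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    # Butterfly stages: each butterfly does one modular multiply (v = a*w)
    # and one add/subtract pair -- the operation the paper's BU optimizes.
    length = 2
    while length <= n:
        w_len = pow(omega, n // length, q)  # twiddle step for this stage
        for start in range(0, n, length):
            w = 1
            for k in range(length // 2):
                u = a[start + k]
                v = a[start + k + length // 2] * w % q
                a[start + k] = (u + v) % q
                a[start + k + length // 2] = (u - v) % q
                w = w * w_len % q
        length <<= 1
    return a

def intt(a, q=Q, omega=OMEGA):
    """Inverse NTT: run the forward transform with omega^-1, then scale by n^-1."""
    n = len(a)
    inv_omega = pow(omega, q - 2, q)  # modular inverse via Fermat's little theorem
    inv_n = pow(n, q - 2, q)
    return [x * inv_n % q for x in ntt(a, q, inv_omega)]
```

A quick sanity check is the roundtrip `intt(ntt(a)) == a`; in the full Perm pipeline the forward and inverse transforms bracket a coefficient permutation on the ciphertext polynomials.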