Fast Permutation Architecture on Encrypted Data for Secure Neural Network Inference

Xiao Hu, Jing Tian, Zhongfeng Wang
Published in: 2020 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS), 2020-12-08
DOI: https://doi.org/10.1109/APCCAS50809.2020.9301698
Citations: 2

Abstract

Secure neural network inference, which combines homomorphic encryption (HE) with deep neural networks (DNNs), has recently attracted much attention. However, the large-number computations introduced by the HE scheme are a bottleneck for real-time applications. A significant portion of the network's computation is the permutation (Perm), which consists mainly of number theoretic transforms (NTTs). In this paper, we propose, for the first time, an efficient architecture for the Perm that combines algorithmic transformations with architecture-level optimizations. First, the core butterfly unit (BU) of the NTT is optimized, which reduces the number of multiplications by about 30% compared with the original BU. Then, based on this optimization, a highly parallel architecture is devised for the Perm. A merging strategy coordinates the operations of the different modules to balance the data path and reduce memory accesses. The proposed architecture is synthesized in TSMC 28-nm CMOS technology. Experimental results show that, for a ciphertext size of 2048×60 bits, the proposed design achieves a 7.54x speedup over an implementation on an Intel(R) Core(TM) i7-6850K 3.60 GHz CPU. Moreover, applying eight Perm engines to 1D convolution yields a 17.25x speedup over the software implementation.
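For readers unfamiliar with the butterfly unit mentioned above, the following is a minimal sketch of a textbook radix-2 Cooley-Tukey NTT in Python. It is not the optimized BU or the architecture proposed in the paper, whose details the abstract does not give. The function names, the toy modulus Q = 17, and the root OMEGA = 2 are assumptions chosen for illustration; a real HE ciphertext would use 2048 coefficients and a modulus of about 60 bits.

```python
# Illustrative baseline only: a standard radix-2 NTT, not the paper's design.
# Q and OMEGA are toy assumptions: 17 is prime and 2 has order 8 modulo 17.
Q = 17
OMEGA = 2


def butterfly(a, b, w, q):
    """Textbook Cooley-Tukey butterfly: one modular multiply, one add, one subtract."""
    t = (b * w) % q
    return (a + t) % q, (a - t) % q


def bit_reverse(values):
    """Reorder a power-of-two-length list into bit-reversed index order."""
    n = len(values)
    bits = n.bit_length() - 1
    out = [0] * n
    for i, v in enumerate(values):
        j = int(format(i, f"0{bits}b")[::-1], 2)
        out[j] = v
    return out


def ntt(values, omega, q):
    """Iterative decimation-in-time NTT; omega must be a primitive n-th root mod q."""
    n = len(values)
    a = bit_reverse(values)
    length = 2
    while length <= n:
        w_len = pow(omega, n // length, q)  # primitive root for this stage
        for start in range(0, n, length):
            w = 1
            for k in range(length // 2):
                i, j = start + k, start + k + length // 2
                a[i], a[j] = butterfly(a[i], a[j], w, q)
                w = (w * w_len) % q
        length *= 2
    return a


def intt(values, omega, q):
    """Inverse NTT: forward transform with omega^-1, then scale by n^-1 (q prime)."""
    n = len(values)
    omega_inv = pow(omega, q - 2, q)
    n_inv = pow(n, q - 2, q)
    return [(x * n_inv) % q for x in ntt(values, omega_inv, q)]


def naive_ntt(values, omega, q):
    """O(n^2) reference transform used only to check the fast version."""
    n = len(values)
    return [sum(values[j] * pow(omega, j * k, q) for j in range(n)) % q
            for k in range(n)]


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6, 7, 8]
    assert ntt(x, OMEGA, Q) == naive_ntt(x, OMEGA, Q)
    assert intt(ntt(x, OMEGA, Q), OMEGA, Q) == x
```

In this baseline, each butterfly costs one modular multiplication, and a length-n transform performs (n/2)·log2(n) butterflies, which is why the multiplier count in the BU is a natural target for hardware optimization.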