MBSNTT: A Highly Parallel Digital In-Memory Bit-Serial Number Theoretic Transform Accelerator

IF 2.8 · Zone 2 (Engineering and Technology) · Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
Akhil Pakala;Zhiyu Chen;Kaiyuan Yang
Journal: IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 33, no. 2, pp. 537-545
Publication date: 2024-09-26
DOI: 10.1109/TVLSI.2024.3462955
URL: https://ieeexplore.ieee.org/document/10695040/
Citations: 0

Abstract

Conventional cryptographic systems protect data during communication but give third-party cloud operators complete access to the decrypted user data they compute on. Homomorphic encryption (HE) promises to rectify this by allowing computations on encrypted data without ever decrypting it. However, HE incurs latency several orders of magnitude higher than conventional encryption schemes. The number theoretic transform (NTT), the core of fast polynomial multiplication, is the bottleneck function in HE. In traditional architectures, memory accesses and limited support for parallel operations constrain NTT's throughput and energy efficiency. Processing in memory (PIM) is an interesting approach that can maximize parallelism with high energy efficiency. To enable HE on resource-constrained edge devices, this article presents MBSNTT, a digital in-memory multi-bit-serial NTT accelerator that achieves high parallelism and energy efficiency for NTT with minimized area. MBSNTT features a novel multi-bit-serial modular multiplication algorithm and a PIM implementation that computes all modular multiplications in an NTT in parallel. It further adopts a constant-geometry NTT data flow for efficient transitions between NTT stages and between cores. Our evaluation shows that MBSNTT achieves $1.62\times$ ($19.08\times$) higher throughput and $64.9\times$ ($2.06\times$) lower energy than the state-of-the-art PIM NTT accelerators Crypto-PIM (MeNTT) at a polynomial order of 8K and a bit width of 128.
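To make the two ideas named in the abstract concrete, the following is a minimal software sketch, not the paper's design. It assumes a textbook single-bit-serial (interleaved shift-and-add) modular multiplication rather than the authors' multi-bit-serial algorithm, and a generic constant-geometry decimation-in-frequency NTT in which every stage reads the pair (i, i + N/2) and writes the pair (2i, 2i + 1). The function names, the toy modulus q = 17, and the root omega = 2 are illustrative assumptions and do not come from the article.

```python
def bit_serial_modmul(a, b, q, width):
    """Return (a * b) % q by scanning b one bit per step, MSB first.

    Assumes 0 <= a < q and 0 <= b < 2**width. Each step doubles the
    accumulator and conditionally adds a, reducing by at most one
    subtraction of q each time, so the accumulator stays below q.
    """
    acc = 0
    for i in reversed(range(width)):
        acc <<= 1
        if acc >= q:
            acc -= q
        if (b >> i) & 1:
            acc += a
            if acc >= q:
                acc -= q
    return acc


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def constant_geometry_ntt(x, omega, q):
    """Forward NTT of x (length a power of two) modulo q.

    omega must be a primitive len(x)-th root of unity mod q, and all
    inputs must lie in [0, q). Every stage uses the same access pattern,
    which is what "constant geometry" refers to; the N/2 butterflies in
    a stage are independent of each other.
    """
    n = len(x)
    stages = n.bit_length() - 1
    width = q.bit_length()
    half = n // 2
    for s in range(stages):
        y = [0] * n
        for i in range(half):
            u, v = x[i], x[i + half]
            w = pow(omega, (i >> s) << s, q)  # twiddle for this butterfly
            y[2 * i] = (u + v) % q
            y[2 * i + 1] = bit_serial_modmul((u - v) % q, w, q, width)
        x = y
    # The stages leave the result in bit-reversed order; undo that here.
    return [x[bit_reverse(k, stages)] for k in range(n)]


def naive_ntt(x, omega, q):
    """Direct O(N^2) evaluation, used only as a reference."""
    n = len(x)
    return [sum(x[j] * pow(omega, j * k, q) for j in range(n)) % q
            for k in range(n)]


# Toy parameters: 2 has multiplicative order 8 modulo 17.
Q, OMEGA = 17, 2
data = [1, 2, 3, 4, 5, 6, 7, 8]
assert constant_geometry_ntt(data, OMEGA, Q) == naive_ntt(data, OMEGA, Q)
```

Because each stage has the same read and write pattern, a hardware implementation can reuse one fixed interconnect for all stages, and the independence of the butterflies within a stage is what allows the modular multiplications to be issued in parallel. How MBSNTT maps these onto memory arrays is described in the article itself and is not modeled here.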
Source journal
CiteScore: 6.40
Self-citation rate: 7.10%
Articles published: 187
Review time: 3.6 months
Journal description: The IEEE Transactions on VLSI Systems is published as a monthly journal under the co-sponsorship of the IEEE Circuits and Systems Society, the IEEE Computer Society, and the IEEE Solid-State Circuits Society. Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chip and wafer fabrication, packaging, testing, and systems applications. Generation of specifications, design, and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor, and process levels. To address this critical area through a common forum, the IEEE Transactions on VLSI Systems was founded. The editorial board, consisting of international experts, invites original papers that emphasize the novel systems integration aspects of microelectronic systems, including interactions among systems design and partitioning, logic and memory design, digital and analog circuit design, layout synthesis, CAD tools, chip and wafer fabrication, testing and packaging, and systems-level qualification. Thus, the coverage of these Transactions focuses on VLSI/ULSI microelectronic systems integration.