SAFE: A Scalable Homomorphic Encryption Accelerator for Vertical Federated Learning

IF 2.7 | Zone 3, Computer Science | Q2, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
Zhaohui Chen;Zhen Gu;Yanheng Lu;Xuanle Ren;Ruiguang Zhong;Wen-Jie Lu;Jiansong Zhang;Yichi Zhang;Hanghang Wu;Xiaofu Zheng;Heng Liu;Tingqiang Chu;Cheng Hong;Changzheng Wei;Dimin Niu;Yuan Xie
Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 44, no. 5, pp. 1662-1675
Publication type: Journal Article
Published: 2024-11-12
DOI: 10.1109/TCAD.2024.3496836
URL: https://ieeexplore.ieee.org/document/10750502/
Citations: 0

Abstract

Privacy preservation has become a critical concern for governments, hospitals, and large corporations. Homomorphic encryption (HE) enables a ciphertext-based computation paradigm with strong security guarantees. In emerging cross-agency data cooperation scenarios such as vertical federated learning (VFL), HE protects the exchanged data from exposure to counterparts. However, computation on ciphertext faces significant performance challenges due to increased data size and substantial overhead. Prior work has proposed accelerating HE with parallel hardware, such as GPUs, FPGAs, and ASICs. However, many existing hardware accelerators target specific HE operations, such as the number theoretic transform (NTT) and key switching, and therefore provide limited performance improvement for end-to-end applications. Others support bootstrapping, which requires quite a large ASIC design. To better support existing VFL training applications, we propose SAFE, an HE accelerator for scalable homomorphic matrix-vector products (HMVPs), which are the performance bottleneck. SAFE adopts a coefficient-wise encoded HMVP algorithm; in addition to a vanilla mode, we further explore compressed and concatenated modes, which can fully utilize the polynomial encoding slots. The proposed hardware architecture, customized for the HMVP dataflow, supports spatial and temporal parallelization of function units. The most costly polynomial function, NTT, is implemented with a low-area constant geometry unit, which improves efficiency by $2.43\times$. SAFE is implemented as a CPU-FPGA heterogeneous acceleration system, unleashing the multithread potential. The evaluation demonstrates a speed-up of up to $36\times$ in end-to-end federated logistic regression training.
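The abstract does not spell out how the coefficient-wise encoding works, so the following is only a plaintext sketch of one well-known way to obtain a matrix-vector product from a single polynomial multiplication in Z_q[X]/(X^N + 1). It is not taken from the paper: the function names, the packing layout (each row reversed and placed in its own block of n coefficients), and the toy parameters N = 8 and q = 97 are all assumptions chosen for illustration, and no encryption is performed. Packing several short rows into one polynomial is meant to convey the general idea of using all N coefficient slots; SAFE's actual vanilla, compressed, and concatenated modes may differ.

```python
def negacyclic_mul(a, b, q):
    """Schoolbook product of a and b in Z_q[X]/(X^N + 1), with N = len(a)."""
    n = len(a)
    out = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < n:
                out[k] = (out[k] + ai * bj) % q
            else:
                # X^N = -1, so terms that wrap around flip sign.
                out[k - n] = (out[k - n] - ai * bj) % q
    return out


def encode_matrix(A, N):
    """Hypothetical packing: row i, reversed, fills coefficients i*n .. i*n+n-1."""
    m, n = len(A), len(A[0])
    assert m * n <= N, "matrix does not fit into one polynomial"
    poly = [0] * N
    for i, row in enumerate(A):
        for j, a in enumerate(row):
            poly[i * n + (n - 1 - j)] = a
    return poly


def encode_vector(v, N):
    """Place v_j at coefficient j and pad with zeros."""
    return list(v) + [0] * (N - len(v))


def matvec_via_poly(A, v, N, q):
    """Read (A @ v) mod q from coefficients i*n + n - 1 of the product."""
    n = len(v)
    prod = negacyclic_mul(encode_matrix(A, N), encode_vector(v, N), q)
    return [prod[i * n + n - 1] for i in range(len(A))]


# Toy example: a 2x4 matrix fills all N = 8 coefficient slots.
A = [[1, 2, 3, 4], [5, 6, 7, 8]]
v = [1, 0, 2, 1]
result = matvec_via_poly(A, v, N=8, q=97)
print(result)  # [11, 27]
```

With this layout, the term a[i][j] * v[k] lands on exponent i*n + (n-1) - j + k, which equals a read-out position i'*n + (n-1) only when i' = i and k = j, so each read-out coefficient holds exactly one row's inner product. Wrapped terms fall on exponents below n-1 and never touch a read-out position. In an HE setting one operand would be a ciphertext and the multiplication would typically be carried out with NTTs rather than the quadratic loop used here.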
Source journal
CiteScore: 5.60
Self-citation rate: 13.80%
Articles published: 500
Review time: 7 months
Journal description: The purpose of this Transactions is to publish papers of interest to individuals in the area of computer-aided design of integrated circuits and systems composed of analog, digital, mixed-signal, optical, or microwave components. The aids include methods, models, algorithms, and man-machine interfaces for system-level, physical, and logical design, including planning, synthesis, partitioning, modeling, simulation, layout, verification, testing, hardware-software co-design, and documentation of integrated circuit and system designs of all complexities. Design tools and techniques for evaluating and designing integrated circuits and systems for metrics such as performance, power, reliability, testability, and security are a focus.