Homomorphic Evaluation Cluster Architecture for Fully Homomorphic Encryption

Impact factor: 2.4 (JCR Q2, Engineering, Electrical & Electronic)

IEEE Open Journal of Circuits and Systems. Pub Date: 2025-03-08. DOI: 10.1109/OJCAS.2025.3568058

Hanyoung Lee;Ardianto Satriawan;Hanho Lee

Volume 6, pages 135-146. Publication type: Journal Article. Full text: https://ieeexplore.ieee.org/document/10993408/


Abstract

Fully Homomorphic Encryption (FHE) allows encrypted data to be processed on cloud servers, providing high security and enabling safe data utilization. As homomorphic multiplications are applied to encrypted data, noise accumulates, requiring a process called bootstrapping that restores the noise level in a new ciphertext $ct^{\prime}$. Bootstrapping involves linear transformations, such as Coefficient to Slots and Slots to Coefficient, in which rotation is the dominant operation. Rotation shifts the elements in the slots to new positions according to a rotation index $k$. However, the computational cost and memory bandwidth required for a rotation add significant overhead and limit the achievable performance of FHE operations. An efficient implementation of rotation is therefore crucial for high-performance FHE applications. To address this problem, we optimized the rotation datapath in the CKKS scheme to be hardware-friendly and proposed a homomorphic evaluation cluster hardware accelerator tailored for FHE workloads. Our architecture takes into account the computational and memory constraints of field programmable gate arrays (FPGAs) and performs the number theoretic transform (NTT), its inverse (INTT), key multiplication, base conversion, and automorphism in a single cluster. We implemented our design on the AMD Alveo U280 FPGA platform. With a polynomial length of $2^{16}$ and operating at 250 MHz as a rotation accelerator, the FPGA implementation shows a speed-up of about $700\times$ compared to the CPU implementation in OpenFHE. Compared to the GPU implementation, it shows a $1.77\times$ speed-up, and compared to previous FPGA implementations, it shows a $1.13\times$ improvement.
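To make the rotation and automorphism steps mentioned in the abstract more concrete, the following is a minimal Python sketch. It is not the authors' datapath. It assumes the usual CKKS convention that a slot rotation by index k corresponds to the ring automorphism X -> X^(5^k) on Z_q[X]/(X^N + 1), applied here in the coefficient domain. The function names `automorphism` and `negacyclic_mul`, the toy modulus, and the small N are illustrative assumptions. A real rotation also requires key switching, which is omitted.

```python
def automorphism(coeffs, k, q):
    """Apply X -> X^(5^k) to a polynomial in Z_q[X]/(X^N + 1).

    coeffs: list of N coefficients (N a power of two), lowest degree first.
    k:      non-negative rotation index (assumed convention: Galois element 5^k mod 2N).
    q:      coefficient modulus (toy value in this sketch).
    """
    n = len(coeffs)
    g = pow(5, k, 2 * n)           # Galois element for rotation index k
    out = [0] * n
    for i, c in enumerate(coeffs):
        j = (i * g) % (2 * n)      # exponent of X^(i*g), using X^(2N) = 1
        if j < n:
            out[j] = c % q
        else:
            out[j - n] = (-c) % q  # X^N = -1, so the coefficient changes sign
    return out


def negacyclic_mul(a, b, q):
    """Schoolbook product in Z_q[X]/(X^N + 1), used only to check the sketch."""
    n = len(a)
    out = [0] * n
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            idx = i + j
            if idx < n:
                out[idx] = (out[idx] + x * y) % q
            else:
                out[idx - n] = (out[idx - n] - x * y) % q  # wrap with sign flip
    return out
```

For example, with N = 4 and k = 1 the map sends X to X^5 = -X. Because 5 has multiplicative order N/2 modulo 2N, applying the map with index k and then with index N/2 - k returns the original polynomial. In this form the automorphism is only an index permutation with conditional negation, with no multiplications on the coefficients.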



