A High-Speed NTT-Based Polynomial Multiplication Accelerator with Vector Extension of RISC-V for Saber Algorithm

2022 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS) · Pub Date: 2022-11-11 · DOI: 10.1109/APCCAS55924.2022.10090293

Honglin Kuang, Yifan Zhao, Jun Han



Abstract

Saber is a post-quantum cryptography (PQC) key encapsulation mechanism (KEM) based on the module learning with rounding problem. It is characterized by the use of power-of-two moduli, which makes all modular reductions free in hardware. However, this choice prevents the direct use of the asymptotically fastest number theoretic transform (NTT) for the time-consuming polynomial multiplication in Saber. To multiply polynomials efficiently, prior work has relied on the schoolbook, Toom-Cook, or Karatsuba algorithms. Although these approaches achieve decent operating speed at moderate area cost, they are at a disadvantage when the system must be extended to support multiple PQC protocols. To enable NTT for Saber, we choose an appropriate prime and use the sign-magnitude format for computation. We propose a concise and efficient vectorized NTT algorithm, based on which we design a configurable vector NTT unit that performs NTT and other arithmetic operations. The accelerator uses a dedicated pipeline to achieve high speed and is driven by a custom vector instruction extension of RISC-V. We implement the proposed architecture with 32 and 16 vector lanes on a Xilinx UltraScale+ ZCU111. Results show that our design achieves up to $5\times$ and $3\times$ improvement in computation time and area-time product (ATP), respectively, for degree-256 polynomial multiplication, compared with state-of-the-art Saber polynomial multipliers.
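The idea of "choosing an appropriate prime" can be illustrated with a small software sketch. It is not the authors' hardware algorithm, and the abstract does not state which prime or bounds the paper uses. The sketch assumes Saber's modulus $2^{13}$, ring $\mathbb{Z}_q[x]/(x^{256}+1)$, and secret coefficients bounded by 5 in absolute value (the name `S_MAX` and all function names are hypothetical). One operand is lifted to signed values, loosely analogous to the signed representation the abstract mentions. A prime $p \equiv 1 \pmod{512}$ is then searched for that exceeds twice the largest possible product coefficient, so the NTT result modulo $p$ equals the exact integer result and can afterwards be reduced modulo $2^{13}$.

```python
import random

N = 256                      # polynomial length in Saber
Q = 1 << 13                  # Saber's power-of-two modulus
S_MAX = 5                    # assumed bound on |secret coefficient|
BOUND = N * (Q // 2) * S_MAX # largest |coefficient| of the exact integer product

def is_prime(n):
    """Trial division; adequate for the ~24-bit candidates searched here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def find_prime(bound, n):
    """Smallest prime p > 2*bound with p = 1 (mod 2n)."""
    p = (2 * bound) // (2 * n) * (2 * n) + 1
    while p <= 2 * bound or not is_prime(p):
        p += 2 * n
    return p

def find_psi(p, n):
    """A primitive 2n-th root of unity mod p (psi**n == -1), n a power of two."""
    for g in range(2, p):
        psi = pow(g, (p - 1) // (2 * n), p)
        if pow(psi, n, p) == p - 1:
            return psi
    raise ValueError("no root of unity found")

P = find_prime(BOUND, N)
PSI = find_psi(P, N)

def ntt(a, omega, p):
    """Recursive radix-2 cyclic NTT; omega is a primitive len(a)-th root of unity."""
    n = len(a)
    if n == 1:
        return a[:]
    even = ntt(a[0::2], omega * omega % p, p)
    odd = ntt(a[1::2], omega * omega % p, p)
    out, w = [0] * n, 1
    for k in range(n // 2):
        t = w * odd[k] % p
        out[k] = (even[k] + t) % p
        out[k + n // 2] = (even[k] - t) % p
        w = w * omega % p
    return out

def negacyclic_mul(a, s, p, psi):
    """a*s mod (x^n + 1, p): pre-twist by psi^i, cyclic NTT, post-twist."""
    n = len(a)
    omega = psi * psi % p
    tw = [pow(psi, i, p) for i in range(n)]
    fa = ntt([x * w % p for x, w in zip(a, tw)], omega, p)
    fs = ntt([x * w % p for x, w in zip(s, tw)], omega, p)
    c = ntt([x * y % p for x, y in zip(fa, fs)], pow(omega, -1, p), p)
    n_inv, psi_inv = pow(n, -1, p), pow(psi, -1, p)
    return [x * n_inv % p * pow(psi_inv, i, p) % p for i, x in enumerate(c)]

def saber_mul(a, s):
    """a in [0, Q), s in [-S_MAX, S_MAX]; returns a*s mod (x^N + 1, Q)."""
    centered = [x - Q if x >= Q // 2 else x for x in a]  # lift to [-Q/2, Q/2)
    c = negacyclic_mul(centered, s, P, PSI)
    exact = [x - P if x > P // 2 else x for x in c]      # recover signed integer
    return [x % Q for x in exact]                        # free power-of-two reduction

def schoolbook(a, s):
    """Reference O(N^2) negacyclic product mod Q."""
    c = [0] * N
    for i in range(N):
        for j in range(N):
            if i + j < N:
                c[i + j] += a[i] * s[j]
            else:
                c[i + j - N] -= a[i] * s[j]
    return [x % Q for x in c]

# Usage: compare the NTT path against the schoolbook reference.
random.seed(1)
a = [random.randrange(Q) for _ in range(N)]
s = [random.randint(-S_MAX, S_MAX) for _ in range(N)]
assert saber_mul(a, s) == schoolbook(a, s)
```

Because the prime is found by search rather than hard-coded, the sketch makes no claim about the specific prime used in the paper. A hardware design would fix the prime and root of unity at design time and precompute the twiddle factors.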



