A lightweight hardware implementation of CRYSTALS-Kyber

Shiyang He, Hui Li, Fenghua Li, Ruhui Ma
Journal of Information and Intelligence, Volume 2, Issue 2, Pages 167-176. Published 2024-03-01.
DOI: 10.1016/j.jiixd.2024.02.004
URL: https://www.sciencedirect.com/science/article/pii/S294971592400009X

Abstract

The security of cryptographic algorithms based on integer factorization and discrete logarithms will be threatened by quantum computers in the future. Since December 2016, the National Institute of Standards and Technology (NIST) has been soliciting post-quantum cryptographic (PQC) algorithms worldwide. CRYSTALS-Kyber was selected as the standard PQC algorithm after three rounds of evaluation. Because current implementations consume substantial resources, this paper presents a lightweight architecture for ASICs and its implementation on FPGAs for prototyping. In this implementation, a novel compact modular multiplication unit (MMU) and a compression/decompression module are proposed to save hardware resources. We put forward a specially optimized schoolbook polynomial multiplication (SPM) core in place of a number theoretic transform (NTT) core for polynomial multiplication, which reduces SLICE cost by about 74%. We also use signed number representation to save memory resources. In addition, we optimize the hardware implementation of the hash module, cutting FF consumption by about 48% through register reuse. Our design can be implemented on a Kintex-7 (XC7K325T-2FFG900I) FPGA for prototyping, where it occupies 4777/4993 LUTs, 2661/2765 FFs, 1395/1452 SLICEs, 2.5/2.5 BRAMs, and 0/0 DSPs on the client/server side, respectively. The maximum clock frequency reaches 244 MHz. As far as we know, our design consumes the fewest resources among existing designs, which makes it well suited to resource-constrained devices.
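
As a rough software illustration of the arithmetic that a schoolbook polynomial multiplier performs, the sketch below multiplies two polynomials in the Kyber ring Z_q[x]/(x^n + 1) with q = 3329 and n = 256. It is not the hardware design described in the paper: the function names `schoolbook_mul` and `to_signed` are hypothetical, and the centered mapping in `to_signed` is only one common meaning of "signed number representation", which may differ from what the authors use.

```python
Q = 3329   # Kyber modulus
N = 256    # number of coefficients per Kyber polynomial


def schoolbook_mul(a, b, n=N, q=Q):
    """Multiply a and b in Z_q[x]/(x^n + 1) with the O(n^2) schoolbook method.

    Hypothetical helper for illustration; inputs are coefficient lists of
    length n, lowest degree first.
    """
    if len(a) != n or len(b) != n:
        raise ValueError("both polynomials must have exactly n coefficients")
    c = [0] * n
    for i in range(n):
        for j in range(n):
            prod = a[i] * b[j]
            k = i + j
            if k < n:
                c[k] = (c[k] + prod) % q
            else:
                # x^n = -1 in this ring, so the term wraps around negated.
                c[k - n] = (c[k - n] - prod) % q
    return c


def to_signed(x, q=Q):
    """Map a residue to the centered range [-(q-1)/2, (q-1)/2] (assumed convention)."""
    x %= q
    return x - q if x > q // 2 else x


# x * x^(n-1) = x^n = -1 in the ring, shown here with a small n for brevity.
small = schoolbook_mul([0, 1, 0, 0], [0, 0, 0, 1], n=4)
signed = [to_signed(v) for v in small]
```

The double loop makes the trade-off visible: schoolbook multiplication needs n^2 coefficient products where an NTT needs on the order of n log n, but it requires no twiddle-factor storage or butterfly network, which is the kind of area saving the abstract reports.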
