Performance and Communication Cost of Hardware Accelerators for Hashing in Post-Quantum Cryptography

IF 2.8 3区 计算机科学 Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
Patrick Karl, Jonas Schupp, Georg Sigl
{"title":"Performance and Communication Cost of Hardware Accelerators for Hashing in Post-Quantum Cryptography","authors":"Patrick Karl, Jonas Schupp, Georg Sigl","doi":"10.1145/3676965","DOIUrl":null,"url":null,"abstract":"SPHINCS+ is a signature scheme included in the first NIST post-quantum standard, that bases its security on the underlying hash primitive. As most of the runtime of SPHINCS+ is caused by the evaluation of several hash- and pseudo-random functions, offloading this computation to dedicated hardware accelerators is a natural step. In this work, we evaluate different architectures for hardware acceleration of such a hash primitive with respect to its use-case and evaluate them in the context of SPHINCS+. We attach hardware accelerators for different hash primitives (SHAKE128 and Ascon-Xof for both, full and round-reduced versions) to CPU interfaces having different transfer speeds. We show, that for most use-cases, data transfer determines the overall performance if accelerators are equipped with FIFOs and that reducing the number of rounds in the permutation does not necessarily lead to significant performance improvements when using hardware acceleration.\n This work extends on a conference paper accepted at COSADE’24, first published in [19], and written by the same authors, where different architectures for hardware accelerators of hash functions are benchmarked and evaluated for SPHINCS+ as a case study. In this paper, we provide results for additional parameter sets for SPHINCS+ and improve the performance of one of the accelerators by adding an additional RISC-V instruction for faster absorption. We then extend the performance benchmark by including the algorithms CRYSTALS-Kyber, CRYSTALS-Dilithium and Falcon. Finally we provide a power/energy comparison for the accelerators.","PeriodicalId":50914,"journal":{"name":"ACM Transactions on Embedded Computing Systems","volume":null,"pages":null},"PeriodicalIF":2.8000,"publicationDate":"2024-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Embedded Computing Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3676965","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
引用次数: 0

Abstract

SPHINCS+ is a signature scheme included in the first NIST post-quantum standard, that bases its security on the underlying hash primitive. As most of the runtime of SPHINCS+ is caused by the evaluation of several hash- and pseudo-random functions, offloading this computation to dedicated hardware accelerators is a natural step. In this work, we evaluate different architectures for hardware acceleration of such a hash primitive with respect to its use-case and evaluate them in the context of SPHINCS+. We attach hardware accelerators for different hash primitives (SHAKE128 and Ascon-Xof for both, full and round-reduced versions) to CPU interfaces having different transfer speeds. We show, that for most use-cases, data transfer determines the overall performance if accelerators are equipped with FIFOs and that reducing the number of rounds in the permutation does not necessarily lead to significant performance improvements when using hardware acceleration. This work extends on a conference paper accepted at COSADE’24, first published in [19], and written by the same authors, where different architectures for hardware accelerators of hash functions are benchmarked and evaluated for SPHINCS+ as a case study. In this paper, we provide results for additional parameter sets for SPHINCS+ and improve the performance of one of the accelerators by adding an additional RISC-V instruction for faster absorption. We then extend the performance benchmark by including the algorithms CRYSTALS-Kyber, CRYSTALS-Dilithium and Falcon. Finally we provide a power/energy comparison for the accelerators.
后量子密码学哈希算法硬件加速器的性能和通信成本
SPHINCS+ 是首个 NIST 后量子标准中的签名方案,其安全性基于底层散列原语。由于 SPHINCS+ 的大部分运行时间是由多个哈希和伪随机函数的评估造成的,因此将这种计算卸载到专用硬件加速器上是很自然的一步。在这项工作中,我们根据哈希基元的使用情况评估了硬件加速哈希基元的不同架构,并在 SPHINCS+ 的背景下对它们进行了评估。我们将不同散列原语(SHAKE128 和 Ascon-Xof,包括完整版和回合缩减版)的硬件加速器连接到具有不同传输速度的 CPU 接口上。我们的研究表明,对于大多数用例,如果加速器配备了先进先出设备,数据传输将决定整体性能,而且在使用硬件加速时,减少排列中的轮数并不一定能显著提高性能。这项工作是对 COSADE'24 会议接受的一篇会议论文的延伸,该论文首次发表于 [19],由同一作者撰写,其中以 SPHINCS+ 为案例研究,对散列式函数硬件加速器的不同架构进行了基准测试和评估。在本文中,我们为 SPHINCS+ 提供了额外参数集的结果,并通过添加额外的 RISC-V 指令提高了其中一个加速器的性能,从而加快了吸收速度。然后,我们扩展了性能基准,纳入了 CRYSTALS-Kyber、CRYSTALS-Dilithium 和 Falcon 算法。最后,我们提供了加速器的功耗/能耗比较。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
ACM Transactions on Embedded Computing Systems
ACM Transactions on Embedded Computing Systems 工程技术-计算机:软件工程
CiteScore
3.70
自引率
0.00%
发文量
138
审稿时长
6 months
期刊介绍: The design of embedded computing systems, both the software and hardware, increasingly relies on sophisticated algorithms, analytical models, and methodologies. ACM Transactions on Embedded Computing Systems (TECS) aims to present the leading work relating to the analysis, design, behavior, and experience with embedded computing systems.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信