An Efficient Accelerator of the Squaring for the Verifiable Delay Function Over a Class Group

Danyang Zhu, Yifeng Song, Jing Tian, Zhongfeng Wang, Haobo Yu
2020 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS), published 2020-12-08
DOI: 10.1109/APCCAS50809.2020.9301680 (https://doi.org/10.1109/APCCAS50809.2020.9301680)
Citations: 6

Abstract

The verifiable delay function (VDF) is widely regarded as the core function for next-generation blockchain systems because it is slow to evaluate but easy to verify. Squaring generally accounts for a significant proportion of VDF computation. Squaring over a class group involves large-number extended greatest common divisor (GCD) computations, divisions, and multiplications, which makes it extremely hard to accelerate in hardware. In this paper, for the first time, we propose an efficient architecture for this squaring, using algorithmic transformations and architectural optimizations to shorten the critical path and reduce the number of calculation cycles. First, the squaring algorithm is modified to allow partially parallel computation, and a hardware-efficient extended GCD algorithm is selected to reduce the overall number of computation cycles. Second, highly parallelized architectures are devised for large-number division and multiplication. Finally, the proposed architecture is coded in a hardware description language (HDL) and synthesized in TSMC 28-nm CMOS technology. The synthesis results show that the proposed design, with an input width of 2048 bits, takes 6.319 µs per squaring on average at a frequency of 500 MHz. Compared to the original squaring with the same setting running on an Intel(R) Core(TM) i7-6850K 3.60 GHz CPU, our design achieves about a 2x speedup.
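To make the operation concrete, the following is a minimal software sketch of squaring a binary quadratic form (a, b, c) in a class group, showing where the extended GCD, the divisions, and the multiplications named in the abstract arise. It is a textbook-style illustration, not the modified algorithm or the hardware architecture proposed in the paper. It assumes a positive definite form with gcd(a, b) = 1, uses the plain Euclidean extended GCD rather than the hardware-efficient variant the authors select, and all function names (`xgcd`, `reduce_form`, `square`, `vdf_eval`) are hypothetical.

```python
def xgcd(x, y):
    """Plain extended Euclid: return (g, s, t) with s*x + t*y == g."""
    old_r, r = x, y
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def normalize(a, b, c):
    """Shift b into the range -a < b <= a without changing the discriminant."""
    r = (a - b) // (2 * a)
    return a, b + 2 * r * a, a * r * r + b * r + c


def reduce_form(a, b, c):
    """Reduce a positive definite form to its unique reduced representative."""
    a, b, c = normalize(a, b, c)
    while a > c or (a == c and b < 0):
        s = (c + b) // (2 * c)
        a, b, c = c, -b + 2 * s * c, c * s * s - b * s + a
    return a, b, c


def square(form):
    """Square a form (a, b, c); assumes a > 0 and gcd(a, b) == 1."""
    a, b, c = form
    g, s, _ = xgcd(b % a, a)           # extended GCD step
    if g != 1:
        raise ValueError("this sketch assumes gcd(a, b) == 1")
    mu = (s * c) % a                   # solves b*mu = c (mod a)
    A = a * a                          # large multiplications
    B = b - 2 * a * mu
    C = mu * mu - (b * mu - c) // a    # exact large division
    return reduce_form(A, B, C)


def vdf_eval(form, T):
    """Apply T sequential squarings, the slow part of a class-group VDF."""
    for _ in range(T):
        form = square(form)
    return form


# Toy example with discriminant b*b - 4*a*c = -23 (class group of order 3).
print(square((2, 1, 3)))       # (2, -1, 3)
print(vdf_eval((2, 1, 3), 3))  # (2, -1, 3)
```

Each squaring depends on the result of the previous one, which is why the per-squaring latency matters. At 500 MHz, the reported 6.319 µs average corresponds to roughly 3,160 clock cycles per squaring (6.319e-6 s × 500e6 Hz ≈ 3159.5).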