An Efficient Accelerator of the Squaring for the Verifiable Delay Function Over a Class Group

2020 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS). Pub Date: 2020-12-08. DOI: 10.1109/APCCAS50809.2020.9301680

Danyang Zhu, Yifeng Song, Jing Tian, Zhongfeng Wang, Haobo Yu


Citations: 6

Abstract

The verifiable delay function (VDF) is widely regarded as a core function for next-generation blockchain systems because it is slow to evaluate but easy to verify. The squaring operation accounts for a significant proportion of VDF computation. Squaring over a class group involves large-number extended greatest common divisor (GCD) computations, divisions, and multiplications, which makes it extremely hard to accelerate in hardware. In this paper, for the first time, we propose an efficient architecture for squaring that uses algorithmic transformations and architectural optimizations to shorten the critical path and reduce the number of computation cycles. First, the squaring algorithm is modified to allow partially parallel computation, and a hardware-efficient extended GCD algorithm is selected to reduce the overall cycle count. Second, highly parallelized architectures are devised for large-number division and for large-number multiplication. Finally, the proposed architecture is coded in a hardware description language (HDL) and synthesized in TSMC 28-nm CMOS technology. The synthesis results show that, for an input width of 2048 bits, the proposed design takes 6.319 µs per squaring on average at a frequency of 500 MHz. Compared with the original squaring under the same setting running on an Intel(R) Core(TM) i7-6850K 3.60 GHz CPU, our design achieves about a 2x speedup.
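To make the operations named in the abstract concrete, the following is a minimal software sketch of squaring in a class group, with elements represented as positive definite binary quadratic forms (a, b, c) of discriminant b² − 4ac < 0. It is a textbook formulation, not the modified algorithm or the hardware-oriented extended GCD used in the paper; the function names `ext_gcd`, `reduce_form`, and `square_form` are hypothetical, and the sketch assumes a > 0 and gcd(a, b) = 1.

```python
def ext_gcd(x, y):
    """Return (g, s, t) with s*x + t*y == g == gcd(x, y), for x, y >= 0."""
    old_r, r = x, y
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def reduce_form(a, b, c):
    """Reduce a positive definite form so that -a < b <= a <= c."""
    while True:
        # Normalize: shift b by a multiple of 2a into the range (-a, a].
        r = (a - b) // (2 * a)
        b, c = b + 2 * r * a, a * r * r + b * r + c
        if a > c:
            # Swap the outer coefficients; this keeps the discriminant.
            a, b, c = c, -b, a
        else:
            break
    if a == c and b < 0:
        b = -b
    return a, b, c


def square_form(a, b, c):
    """Square the form (a, b, c); assumes a > 0 and gcd(a, b) == 1."""
    # Extended GCD step: s is the inverse of b modulo a.
    g, s, _ = ext_gcd(b % a, a)
    if g != 1:
        raise ValueError("this sketch assumes gcd(a, b) == 1")
    mu = (s * c) % a                  # solves b*mu = c (mod a)
    A = a * a                         # multiplication
    B = b - 2 * a * mu
    C = mu * mu - (b * mu - c) // a   # exact division by construction
    return reduce_form(A, B, C)


print(square_form(2, 1, 3))  # discriminant -23, prints (2, -1, 3)
```

Even in this small form, each squaring needs one extended GCD, several multiplications, and divisions inside the reduction loop, all on operands that grow to thousands of bits in a VDF setting. The steps are also largely sequential, which is consistent with the abstract's point that the operation is hard to accelerate.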



