An Efficient Accelerator of the Squaring for the Verifiable Delay Function Over a Class Group
Authors: Danyang Zhu, Yifeng Song, Jing Tian, Zhongfeng Wang, Haobo Yu
Venue: 2020 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS)
Published: 2020-12-08
DOI: 10.1109/APCCAS50809.2020.9301680 (https://doi.org/10.1109/APCCAS50809.2020.9301680)
The verifiable delay function (VDF) is widely regarded as the core function for next-generation blockchain systems because it is slow to evaluate but easy to verify. Squaring accounts for a significant proportion of VDF computation. Moreover, squaring over a class group involves large-number extended greatest common divisor (GCD) computations, divisions, and multiplications, which makes it extremely hard to accelerate in hardware. In this paper, for the first time, we propose an efficient architecture for squaring that uses algorithmic transformations and architectural optimizations to reduce the critical path and the number of calculation cycles. First, the squaring algorithm is modified to achieve partial parallel computing, and a very hardware-efficient extended GCD algorithm is selected to reduce the total number of computation cycles. Second, highly parallelized architectures for large-number division and multiplication are devised. Finally, the proposed architecture is coded in a hardware description language (HDL) and synthesized under TSMC 28-nm CMOS technology. The synthesis results show that the proposed design, with an input width of 2048 bits, takes 6.319 μs per squaring on average at a frequency of 500 MHz. Compared to the original squaring with the same setting running on an Intel(R) Core(TM) i7-6850K 3.60 GHz CPU, our design achieves about a 2x speedup.
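To make concrete why class-group squaring combines an extended GCD, a division, and several multiplications, here is a minimal software sketch of the textbook squaring of a binary quadratic form (a, b, c) with discriminant b^2 - 4ac < 0. It is not the architecture proposed in the paper, and it does not reproduce the paper's modified algorithm or its chosen extended GCD variant. The function names (`ext_gcd`, `normalize`, `reduce_form`, `square`) are hypothetical, and the sketch assumes gcd(a, b) = 1 and a > 0.

```python
def ext_gcd(x, y):
    """Return (g, s, t) with x*s + y*t == g == gcd(x, y); assumes y > 0."""
    old_r, r = x, y
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def normalize(a, b, c):
    """Shift b into the range -a < b <= a without changing the discriminant."""
    r = (a - b) // (2 * a)
    return a, b + 2 * r * a, a * r * r + b * r + c


def reduce_form(a, b, c):
    """Reduce a positive definite form so that |b| <= a <= c."""
    a, b, c = normalize(a, b, c)
    while a > c or (a == c and b < 0):
        s = (c + b) // (2 * c)
        a, b, c = c, -b + 2 * s * c, c * s * s - b * s + a
    return normalize(a, b, c)


def square(a, b, c):
    """Square the form (a, b, c) in its class group (assumes gcd(a, b) == 1)."""
    g, s, _ = ext_gcd(b, a)           # extended GCD step: b*s == 1 (mod a)
    if g != 1:
        raise ValueError("this sketch assumes gcd(a, b) == 1")
    mu = (s * c) % a                  # solves b*mu == c (mod a)
    new_a = a * a                     # multiplications
    new_b = b - 2 * a * mu
    new_c = mu * mu - (b * mu - c) // a   # exact division by a
    return reduce_form(new_a, new_b, new_c)


if __name__ == "__main__":
    # Discriminant -23: squaring (2, 1, 3) gives its inverse (2, -1, 3).
    print(square(2, 1, 3))
```

In a VDF the forms have discriminants of thousands of bits, so each of these steps operates on large numbers, and the squaring is applied repeatedly to its own output; the toy discriminant above only serves to check the arithmetic by hand.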