Myosotis: An Efficiently Pipelined and Parameterized Multiscalar Multiplication Architecture via Data Sharing

IF 2.9 · Zone 3 (Computer Science) · JCR Q2, Computer Science, Hardware & Architecture
Changxu Liu;Hao Zhou;Lan Yang;Zheng Wu;Patrick Dai;Yinlong Li;Shiyong Wu;Fan Yang
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 44, no. 7, pp. 2738-2750. Journal Article. Published 2024-12-31. DOI: 10.1109/TCAD.2024.3524364. URL: https://ieeexplore.ieee.org/document/10818748/
Citations: 0

Abstract

Myosotis: An Efficiently Pipelined and Parameterized Multiscalar Multiplication Architecture via Data Sharing
Zero-knowledge proof (ZKP) is a widely used privacy-preserving technology, in which multiscalar multiplication (MSM) accounts for over 70% of the computational workload. Accelerating MSM improves the overall performance of ZKP, which has made it a focus of community attention. However, in practical applications that deploy multiple MSM accelerators, existing designs often overlook strategies for optimizing bandwidth and area efficiency. To address this, we propose Myosotis, an efficiently pipelined and parameterized MSM architecture. By sharing input data and allocating cache effectively, it reduces the average transmission bandwidth at runtime. Myosotis also supports multiple point addition (PADD) units for higher performance, balancing area overhead against latency to improve area efficiency. Different parameter selections enable a tradeoff among the performance, area, and bandwidth of the MSM accelerator. When benchmarked with MSM degrees between $2^{18}$ and $2^{26}$, our proposed baseline design achieves speedups of up to $3.32\times$ and $6.72\times$ over state-of-the-art FPGA and ASIC designs, respectively. Compared to the baseline, Myosotis with two window MSMs and one PADD unit reduces bandwidth demand by 43% while maintaining similar area and latency. Myosotis with three window MSMs and two PADD units, in turn, decreases latency by 43% and bandwidth by 17%, with only a 9% area increase.
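The abstract does not define MSM or describe how a "window MSM" is computed. As a rough illustration only, the sketch below assumes the usual definition (the sum of k_i * P_i over all scalar/point pairs) and the commonly used windowed bucket method. It is not the Myosotis design. Integers under addition modulo `modulus` stand in for elliptic-curve points so that the result is easy to check, and all names (`padd`, `window_msm`, `bucket_msm`, `naive_msm`) are hypothetical.

```python
# Illustrative sketch only: a windowed bucket MSM over a stand-in group.
# Assumption: "points" are integers under addition mod `modulus`, not real
# elliptic-curve points; padd() plays the role of a point addition unit.

def padd(a, b, modulus):
    """Stand-in point addition: group operation of integers mod `modulus`."""
    return (a + b) % modulus


def naive_msm(scalars, points, modulus):
    """Reference result: sum of k_i * P_i, computed directly."""
    acc = 0
    for k, p in zip(scalars, points):
        acc = (acc + k * p) % modulus
    return acc


def window_msm(scalars, points, window_index, window_bits, modulus):
    """MSM restricted to one window_bits-wide slice of every scalar."""
    mask = (1 << window_bits) - 1
    buckets = [0] * (mask + 1)  # buckets[0] stays unused; 0 is the identity
    for k, p in zip(scalars, points):
        digit = (k >> (window_index * window_bits)) & mask
        if digit:
            buckets[digit] = padd(buckets[digit], p, modulus)
    # The running-sum trick computes sum_j j * buckets[j] with additions only.
    running = 0
    total = 0
    for j in range(mask, 0, -1):
        running = padd(running, buckets[j], modulus)
        total = padd(total, running, modulus)
    return total


def bucket_msm(scalars, points, scalar_bits, window_bits, modulus):
    """Combine the per-window results, most significant window first."""
    num_windows = -(-scalar_bits // window_bits)  # ceiling division
    result = 0
    for w in range(num_windows - 1, -1, -1):
        for _ in range(window_bits):
            result = padd(result, result, modulus)  # doubling
        partial = window_msm(scalars, points, w, window_bits, modulus)
        result = padd(result, partial, modulus)
    return result
```

For example, `bucket_msm([5, 12, 255, 1023], [3, 7, 11, 13], 10, 4, 1000003)` should agree with `naive_msm` on the same inputs. In this sketch every window pass reads the same `points` list. Whether that corresponds to the input data sharing mentioned in the abstract is not stated there.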
Source journal: CiteScore 5.60 · Self-citation rate 13.80% · Publication volume 500 · Review time 7 months
Journal description: The purpose of this Transactions is to publish papers of interest to individuals in the area of computer-aided design of integrated circuits and systems composed of analog, digital, mixed-signal, optical, or microwave components. The aids include methods, models, algorithms, and man-machine interfaces for system-level, physical and logical design including: planning, synthesis, partitioning, modeling, simulation, layout, verification, testing, hardware-software co-design and documentation of integrated circuit and system designs of all complexities. Design tools and techniques for evaluating and designing integrated circuits and systems for metrics such as performance, power, reliability, testability, and security are a focus.