Area-Efficient Modular Multiplication on FPGA
Authors: Yujun Xie; Yuan Liu
Journal: IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 72, no. 9, pp. 1253-1257
Published: 2025-07-02
DOI: 10.1109/TCSII.2025.3585441
URL: https://ieeexplore.ieee.org/document/11063352/
Citations: 0
Abstract
Modular multiplication (MM) consists of a multiplication followed by a modular reduction. In this brief, we explore an area-efficient modular reduction for MM on FPGA. We analyze and compare the equivalent LUT6 (ELUT6) cost of implementing modular reduction with different memory strategies (BRAM/LUT6/LUT5), and adopt LUT5, which has the lowest ELUT6 cost, as the memory for this design. We then propose an area-efficient compression strategy built on a new (1,5;3) Generalized Parallel Counter (GPC), which reduces the LUT6 cost of the compression operation in modular reduction compared to previous methods. Finally, we adopt the 4-term Karatsuba algorithm to reduce the area of the multiplication, and explore how to balance hardware delay across the MM. The proposed MM is implemented on the Xilinx Virtex-7 platform. Compared to the previous state-of-the-art pipelined design, the proposed MM occupies only 41.7%/47.6%/47.6%/50.0% of that design's area at word sizes $w = 32$/64/128/256.
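As a rough software illustration of two ideas named in the abstract, the Python sketch below models (a) modular reduction by table lookup, where each chunk of the high half of the product indexes a precomputed table of residues, and (b) the input/output behaviour implied by the standard GPC notation (1,5;3): one bit of weight 2 and five bits of weight 1 are summed into a 3-bit result. All function names, the 5-bit chunk width, and the table layout are assumptions made for illustration; the brief's actual LUT5 memory organisation and the internal LUT mapping of its new GPC are not described in the abstract and are not reproduced here.

```python
# Behavioural sketch only. Names, chunk width, and table layout are
# hypothetical; this is not the hardware design from the brief.

def build_reduction_tables(p, w, chunk_bits=5):
    """Precompute (v << shift) mod p for each chunk of the high w bits
    of a 2w-bit product. Each table has at most 2**chunk_bits entries."""
    tables = []
    for shift in range(w, 2 * w, chunk_bits):
        width = min(chunk_bits, 2 * w - shift)  # last chunk may be narrower
        tables.append([(v << shift) % p for v in range(1 << width)])
    return tables


def lut_mod_mul(a, b, p, w, tables, chunk_bits=5):
    """Compute a*b mod p for a, b < 2**w using table-lookup reduction."""
    product = a * b                        # full 2w-bit product
    acc = product & ((1 << w) - 1)         # low w bits pass through unchanged
    high = product >> w
    mask = (1 << chunk_bits) - 1
    for i, table in enumerate(tables):
        chunk = (high >> (i * chunk_bits)) & mask
        acc += table[chunk]                # add the precomputed residue
    # acc is congruent to a*b (mod p) but may exceed p. Hardware would
    # compress the partial terms and apply a final correction; here we
    # simply use % for brevity.
    return acc % p


def gpc_1_5_3(weight1_bits, weight2_bit):
    """(1,5;3) counter: five weight-1 bits plus one weight-2 bit -> 3-bit sum.
    Returns the sum bits least-significant first."""
    assert len(weight1_bits) == 5
    total = sum(weight1_bits) + 2 * weight2_bit   # ranges over 0..7
    return [(total >> k) & 1 for k in range(3)]
```

A table-based reduction of this kind trades multipliers for small memories, which is why the memory strategy chosen (BRAM, LUT6, or LUT5) matters for area. The counter's maximum input sum is 5 + 2 = 7, so three output bits always suffice.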
About the journal:
TCAS II publishes brief papers in the field specified by the theory, analysis, design, and practical implementations of circuits, and the application of circuit techniques to systems and to signal processing. Included is the whole spectrum from basic scientific theory to industrial applications. The field of interest covered includes:
Circuits: Analog, Digital and Mixed Signal Circuits and Systems
Nonlinear Circuits and Systems, Integrated Sensors, MEMS and Systems on Chip, Nanoscale Circuits and Systems, Optoelectronic Circuits and Systems, Power Electronics and Systems
Software for Analog-and-Logic Circuits and Systems
Control aspects of Circuits and Systems.