Efficient NTT/INTT processor for FALCON post-quantum cryptography
Authors: Ghada Alsuhli, Hani Saleh, Mahmoud Al-Qutayri, Baker Mohammad, Thanos Stouraitis
Journal: Journal of Information Security and Applications, vol. 93, Article 104177 (impact factor 3.7; JCR Q2, Computer Science, Information Systems)
Published: 2025-08-15
DOI: 10.1016/j.jisa.2025.104177
URL: https://www.sciencedirect.com/science/article/pii/S2214212625002145
Citations: 0
Abstract
FALCON is a lattice-based post-quantum cryptographic (PQC) digital signature standard known for its compact signatures and resistance to quantum attacks. Despite its recent standardization, its hardware implementation remains an open challenge, particularly for key generation, which is significantly more complex than the simple and well-studied signature verification process. In this paper, targeting resource-constrained edge devices, we present an energy-efficient and area-optimized NTT/INTT architecture tailored to the specific requirements of FALCON key generation. By leveraging NTT-friendly primes and reducing the size of the multipliers in the Montgomery reduction algorithm, which is optimized for ASIC implementation, our design minimizes hardware complexity and achieves the lowest power and area among state-of-the-art Montgomery reduction implementations. The proposed hardware architecture features a processing element array, distributed SRAMs, and ROMs, with three levels of reconfigurability, and supports both NTT and INTT operations. Designed in GlobalFoundries' 22 nm FD-SOI process, the Application-Specific Integrated Circuit (ASIC) is estimated to occupy 0.04 mm² and consume 18.2 mW at 1 GHz. The proposed processor is 700 times more energy-efficient and 200 times faster than software implementations on the ARM Cortex-M4. It also achieves the lowest area–time product and the highest energy efficiency among state-of-the-art NTT/INTT hardware accelerators. By carefully balancing power consumption and computational speed, this design offers an efficient solution for deploying FALCON key generation on devices with limited resources.
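For readers unfamiliar with the two building blocks named in the abstract, the following is a minimal software sketch of Montgomery reduction and an NTT/INTT round trip. It is not the paper's architecture. The modulus q = 12289 (FALCON's public modulus), the radix R = 2^16, the transform length 8, the plain cyclic transform (rather than a negacyclic one), and all function names are assumptions chosen for illustration; the abstract does not state which primes or parameters the processor uses.

```python
# Illustrative sketch only: parameters and names are assumptions, not taken from the paper.
Q = 12289                          # assumed NTT-friendly prime: Q - 1 = 2^12 * 3
R_BITS = 16
R = 1 << R_BITS                    # Montgomery radix, R > Q and gcd(R, Q) = 1
Q_NEG_INV = (-pow(Q, -1, R)) % R   # -Q^{-1} mod R (requires Python 3.8+)


def montgomery_reduce(t):
    """Return t * R^{-1} mod Q for 0 <= t < Q * R, using only shifts and masks by R."""
    m = ((t & (R - 1)) * Q_NEG_INV) & (R - 1)
    u = (t + m * Q) >> R_BITS      # exact division: t + m*Q is a multiple of R
    return u - Q if u >= Q else u


def to_mont(a):
    """Map a to its Montgomery form a * R mod Q."""
    return (a * R) % Q


def mont_mul(a, b):
    """Multiply two Montgomery-form values; the result is also in Montgomery form."""
    return montgomery_reduce(a * b)


def find_root(n):
    """Search for a primitive n-th root of unity mod Q (n a power of two dividing Q - 1)."""
    for g in range(2, Q):
        w = pow(g, (Q - 1) // n, Q)
        if pow(w, n // 2, Q) == Q - 1:   # w^(n/2) = -1 means the order is exactly n
            return w
    raise ValueError("no primitive root of unity found")


def ntt(a, w):
    """Iterative radix-2 cyclic NTT of a, where w is a primitive len(a)-th root of unity."""
    n = len(a)
    a = list(a)
    j = 0
    for i in range(1, n):                # bit-reversal permutation
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    length = 2
    while length <= n:
        half = length // 2
        w_len = pow(w, n // length, Q)
        # Twiddles kept in Montgomery form, so one reduction yields x * w mod Q.
        tw = [to_mont(pow(w_len, k, Q)) for k in range(half)]
        for start in range(0, n, length):
            for k in range(half):
                u = a[start + k]
                v = montgomery_reduce(a[start + k + half] * tw[k])
                a[start + k] = (u + v) % Q
                a[start + k + half] = (u - v) % Q
        length <<= 1
    return a


def intt(a, w):
    """Inverse transform: forward NTT with w^{-1}, then scale by n^{-1} mod Q."""
    n_inv = pow(len(a), -1, Q)
    return [(x * n_inv) % Q for x in ntt(a, pow(w, -1, Q))]


w8 = find_root(8)
coeffs = [1, 2, 3, 4, 5, 6, 7, 8]
print(intt(ntt(coeffs, w8), w8) == coeffs)
```

Storing the twiddle factors in Montgomery form lets each butterfly use a single Montgomery reduction while the data stays in ordinary form. This is a common software convention and is shown here only to connect the two ideas; whether the paper's ROMs hold twiddles this way is not stated in the abstract.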
About the journal:
Journal of Information Security and Applications (JISA) focuses on original research and practice-driven applications relevant to information security and its applications. JISA links a vibrant scientific and research community with industry professionals by offering a clear view of modern problems and challenges in information security and by identifying promising scientific and "best-practice" solutions. JISA issues balance original research with innovative industrial approaches from internationally renowned information security experts and researchers.