Jianming Lin, Weize Wang, Chang-An Zhao, Yuhao Zheng
Efficient Implementations of Square-root Vélu's Formulas
Information Processing Letters, volume 190, Article 106580. Published 2025-04-15. DOI: 10.1016/j.ipl.2025.106580
https://www.sciencedirect.com/science/article/pii/S0020019025000249
Citation count: 0
Abstract
In implementations of isogeny-based cryptographic schemes, Vélu's formulas are essential for constructing and evaluating isogenies. Bernstein et al. proposed an approach known as √élu, which computes an ℓ-isogeny at a cost of $\tilde{O}(\sqrt{\ell})$ finite field operations. This paper presents two improvements to the efficiency of √élu implementations: optimizing the index system that √élu requires, and speeding up the sums of products used in polynomial multiplications over a finite field $\mathbb{F}_p$ of characteristic $p$. To optimize the index system, we modify it to make better use of x-coordinates and combine it with the technique of redundant representation, which ultimately reduces the number of $\mathbb{F}_p$-multiplications. To speed up the sums of products, we employ two techniques that aim to minimize the underlying operations: lazy reduction (abbreviated as LZYR) and generalized interleaved Montgomery multiplication (abbreviated as INTL). We provide an optimized C and assembly implementation of √élu. In terms of computational cost (counted in $\mathbb{F}_p$-multiplications) for each isogeny involved in CTIDH2048 (resp. SQIsign), our modified index system, including its combination with redundant representation, saves up to 5.78% (resp. 5.39%) compared to previous work. In terms of performance (reported in CPU clock cycles) of isogeny computations in CTIDH512, our index system combined with INTL (resp. LZYR) offers an improvement of up to 16.05% (resp. 10.96%) over the previous implementation. For executing an isogeny group action in CTIDH512, our experimental results also show a reduction of 3.73% (resp. 1.83%) in clock cycles when using our index system combined with INTL (resp. LZYR).
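The abstract does not spell out how lazy reduction applies to a sum of products, so the following is only a minimal sketch of the general idea, not the authors' C and assembly code. It assumes a toy single-word setting: the prime `P = 2^61 - 1`, the Montgomery radix `R = 2^64`, and all function names are hypothetical choices made for illustration. The eager version performs one Montgomery reduction per product, while the lazy version accumulates the unreduced double-width products and reduces once at the end, keeping the accumulator below `P * R` with a cheap conditional subtraction.

```python
# Toy parameters (assumptions for illustration, not taken from the paper).
P = 2**61 - 1                      # small odd prime modulus
R_BITS = 64
R = 1 << R_BITS                    # Montgomery radix, R > P
R_MASK = R - 1
P_NEG_INV = (-pow(P, -1, R)) % R   # -P^{-1} mod R (needs Python 3.8+)
R2 = (R * R) % P                   # R^2 mod P, used to enter Montgomery form


def mont_reduce(t):
    """Return t * R^{-1} mod P, for any 0 <= t < P * R."""
    m = ((t & R_MASK) * P_NEG_INV) & R_MASK
    u = (t + m * P) >> R_BITS      # exact division: t + m*P is a multiple of R
    return u - P if u >= P else u


def to_mont(a):
    """Map a to its Montgomery representative a * R mod P."""
    return mont_reduce((a % P) * R2)


def from_mont(x):
    """Map a Montgomery representative back to the ordinary residue."""
    return mont_reduce(x)


def sum_of_products_eager(xs, ys):
    """One Montgomery reduction per product (len(xs) reductions in total)."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += mont_reduce(x * y)
        if acc >= P:
            acc -= P
    return acc


def sum_of_products_lazy(xs, ys):
    """Accumulate unreduced products; a single Montgomery reduction at the end."""
    bound = P * R
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y               # each product is < P^2 < P * R
        if acc >= bound:           # subtracting P * R leaves acc mod P unchanged
            acc -= bound
    return mont_reduce(acc)


if __name__ == "__main__":
    a, b = [3, 5, 7], [11, 13, 17]
    xs, ys = [to_mont(v) for v in a], [to_mont(v) for v in b]
    print(from_mont(sum_of_products_lazy(xs, ys)))  # prints 217
```

Both functions return the same Montgomery-form result; the difference is only in how many reductions are performed. In a multi-limb implementation the conditional subtraction of `P * R` touches only the high half of the accumulator, which is one reason this style of lazy reduction tends to be cheap; whether the paper uses exactly this variant is not stated in the abstract.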
About the journal:
Information Processing Letters invites submission of original research articles that focus on fundamental aspects of information processing and computing. This naturally includes work in the broadly understood field of theoretical computer science, although papers in all areas of scientific inquiry will be given consideration, provided that they describe research contributions credibly motivated by applications to computing and involve rigorous methodology. High-quality experimental papers that address topics of sufficiently broad interest may also be considered.
Since its inception in 1971, Information Processing Letters has served as a forum for timely dissemination of short, concise and focused research contributions. Continuing with this tradition, and to expedite the reviewing process, manuscripts are generally limited in length to nine pages when they appear in print.