{"title":"Enhancing the Scalability and Memory Usage of Hashsieve on Multi-core CPUs","authors":"Artur Mariano, C. Bischof","doi":"10.1109/PDP.2016.31","DOIUrl":null,"url":null,"abstract":"The Shortest Vector Problem (SVP) is a key problem in lattice-based cryptography and cryptanalysis. While the cryptography community has accumulated a vast knowledge of SVP-solvers from a theoretical standpoint, the practical performance of these algorithms is commonly not well understood. This gap in knowledge poses many challenges to cryptographers, who are oftentimes confronted with algorithms that perform worse in practice then expected from theory. This is a problem because the asymptotic complexity of the best algorithms plays a key role in the construction of cryptosystems, but only practically appealing, validated algorithms are accounted for in this process. Thus, if one cannot extract the full potential of theoretically strong algorithms in practice, efficient algorithms might be ruled out and wrong assumptions are made when constructing cryptosystems. In this paper, we take a step forward to fill this gap, by providing a computational analysis of HashSieve, the most practical sieving SVP-solver to date, and showing how its performance can be enhanced in practice. To this end, we revisit the parallel generation of random numbers, memory allocation and memory access patterns. Employing scalable random sampling, object memory pools, scalable memory allocators and aggressive memory prefetching, we were able to improve the best current implementation of HashSieve by factors of 3x and 4x, depending on the lattice dimension, and set new records for the HashSieve algorithm, thereby shrinking the gap between its theoretical complexity and its performance in practice.","PeriodicalId":192273,"journal":{"name":"2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PDP.2016.31","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 13
Abstract
The Shortest Vector Problem (SVP) is a key problem in lattice-based cryptography and cryptanalysis. While the cryptography community has accumulated vast knowledge of SVP solvers from a theoretical standpoint, the practical performance of these algorithms is commonly not well understood. This gap in knowledge poses many challenges to cryptographers, who are often confronted with algorithms that perform worse in practice than theory predicts. This is a problem because the asymptotic complexity of the best algorithms plays a key role in the construction of cryptosystems, yet only practically appealing, validated algorithms are accounted for in this process. Thus, if one cannot extract the full potential of theoretically strong algorithms in practice, efficient algorithms might be ruled out and wrong assumptions made when constructing cryptosystems. In this paper, we take a step towards filling this gap by providing a computational analysis of HashSieve, the most practical sieving SVP solver to date, and showing how its performance can be enhanced in practice. To this end, we revisit the parallel generation of random numbers, memory allocation, and memory access patterns. Employing scalable random sampling, object memory pools, scalable memory allocators, and aggressive memory prefetching, we improve the best current implementation of HashSieve by factors of 3x to 4x, depending on the lattice dimension, and set new records for the HashSieve algorithm, thereby shrinking the gap between its theoretical complexity and its performance in practice.
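To make two of the abstract's techniques concrete, below is a minimal C++ sketch (not the authors' code; all names such as VecEntry, VecPool, and probe_bucket are hypothetical) of an object memory pool that recycles vector entries instead of allocating them individually, and of software prefetching while scanning a hash bucket, which is the kind of aggressive prefetching that can hide the latency of the essentially random memory accesses a sieving algorithm performs.

```cpp
// Illustrative sketch only, assuming a GCC/Clang C++ toolchain.
#include <cstddef>
#include <cstdint>
#include <vector>

struct VecEntry {                 // candidate lattice vector (hypothetical layout)
    std::vector<int64_t> coords;
    double norm = 0.0;
};

// Simple object pool: entries are allocated in large slabs and recycled,
// avoiding per-vector malloc/free traffic on the hot path.
class VecPool {
    std::vector<VecEntry*> free_list_;
    std::vector<std::vector<VecEntry>> slabs_;
    std::size_t slab_size_;
public:
    explicit VecPool(std::size_t slab_size = 4096) : slab_size_(slab_size) {}
    VecEntry* acquire() {
        if (free_list_.empty()) {
            slabs_.emplace_back(slab_size_);
            for (auto& e : slabs_.back()) free_list_.push_back(&e);
        }
        VecEntry* e = free_list_.back();
        free_list_.pop_back();
        return e;
    }
    void release(VecEntry* e) { free_list_.push_back(e); }
};

// Scan a hash bucket, prefetching the entry a few iterations ahead so the
// irregular memory accesses overlap with computation.
double probe_bucket(const std::vector<const VecEntry*>& bucket) {
    constexpr std::size_t kAhead = 4;   // prefetch distance (tuning parameter)
    double best = 1e300;
    for (std::size_t i = 0; i < bucket.size(); ++i) {
        if (i + kAhead < bucket.size())
            __builtin_prefetch(bucket[i + kAhead], 0 /*read*/, 1 /*low temporal locality*/);
        if (bucket[i]->norm < best) best = bucket[i]->norm;
    }
    return best;
}

int main() {
    VecPool pool;
    VecEntry* e = pool.acquire();
    e->coords = {1, -2, 3};
    e->norm = 14.0;
    std::vector<const VecEntry*> bucket = {e};
    double best = probe_bucket(bucket);
    pool.release(e);
    return best < 1e300 ? 0 : 1;
}
```

In a multi-threaded setting one would typically give each thread its own pool (or back it with a scalable allocator such as TBB's) to avoid contention; the prefetch distance is a tuning knob that depends on the per-entry work and the memory latency of the target machine.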