A Parallel Hardware Architecture for fast Gaussian Elimination over GF(2)

2006 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines. Pub Date: 2006-04-24. DOI: 10.1109/FCCM.2006.12

A. Bogdanov, M. Mertens, C. Paar, J. Pelzl, Andy Rupp


Citations: 52

Abstract

A Parallel Hardware Architecture for fast Gaussian Elimination over GF(2)

This paper presents a hardware-optimized variant of the well-known Gaussian elimination over GF(2) and its highly efficient implementation. The proposed hardware architecture can solve any regular and (uniquely solvable) overdetermined linear system of equations (LSE) and is not limited to matrices of a certain structure. Besides solving LSEs, the architecture can also perform the related task of matrix inversion extremely fast. Its average running time for $n \times n$ binary matrices with uniformly distributed entries is $2n$ clock cycles, as opposed to about $\frac{1}{4}n^3$ in software. The average running time remains very close to $2n$ for matrices with densities much greater or lower than 0.5. The architecture has a worst-case time complexity of $O(n^2)$ and a space complexity of $O(n^2)$. With these characteristics, the architecture is particularly suited to efficiently solving medium-sized LSEs, such as those that appear in the cryptanalysis of certain stream cipher classes. Moreover, we propose a hardware-optimized algorithm for matrix-by-matrix multiplication over GF(2) which runs in linear time and quadratic space on a similar architecture. This opens up the possibility of building a more complex architecture for efficiently solving larger LSEs by means of Strassen's algorithm, which could significantly improve the time complexity of algebraic attacks on various ciphers. As a proof of concept, we realized our architecture on a contemporary low-cost FPGA. The implementation for a $50 \times 50$ LSE can be clocked at a frequency of up to 300 MHz and computes the solution in 0.33 µs on average.
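For orientation, the problem the abstract addresses can be sketched in software. The following is a minimal Gauss-Jordan elimination over GF(2) on bit-packed rows. It is a plain software reference, not the paper's hardware architecture. The function names `pack_row` and `solve_gf2` and the bit layout (coefficient of x_j in bit j, right-hand side in bit n) are assumptions made for this illustration.

```python
from typing import List, Optional


def pack_row(coeffs: List[int], rhs: int) -> int:
    """Pack one equation into an int: bit j holds the coefficient of x_j,
    bit len(coeffs) holds the right-hand side. (Hypothetical layout.)"""
    word = 0
    for j, c in enumerate(coeffs):
        word |= (c & 1) << j
    return word | ((rhs & 1) << len(coeffs))


def solve_gf2(rows: List[int], n: int) -> Optional[List[int]]:
    """Solve an m x n system over GF(2) with m >= n by Gauss-Jordan elimination.

    Returns the unique solution as a list of n bits, or None when the
    system is not uniquely solvable (singular or inconsistent).
    """
    rows = list(rows)  # work on a copy
    for col in range(n):
        # Find a row at or below the diagonal with a 1 in this column.
        pivot = next(
            (r for r in range(col, len(rows)) if (rows[r] >> col) & 1), None
        )
        if pivot is None:
            return None  # no pivot: coefficient matrix lacks full column rank
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Over GF(2), adding a row is XOR; clear this column in all other rows.
        for r in range(len(rows)):
            if r != col and (rows[r] >> col) & 1:
                rows[r] ^= rows[col]
    # In an overdetermined system, leftover rows must have reduced to 0 = 0.
    if any(rows[r] for r in range(n, len(rows))):
        return None
    return [(rows[i] >> n) & 1 for i in range(n)]


# Example system: x0+x1 = 1, x1+x2 = 1, x0+x1+x2 = 0
system = [
    pack_row([1, 1, 0], 1),
    pack_row([0, 1, 1], 1),
    pack_row([1, 1, 1], 0),
]
solution = solve_gf2(system, 3)
```

In this sketch each row elimination is a single XOR on a machine word, which is why GF(2) elimination maps naturally onto parallel hardware. The timing figures in the abstract are also mutually consistent: an average of 2n = 100 clock cycles for n = 50 at 300 MHz gives 100 / (300 × 10^6) s, or about 0.33 µs.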
