A Highly Customizable and Efficient Hardware Implementation for Parallel Matrix Inversion

2022 International Conference on Field-Programmable Technology (ICFPT). Pub Date: 2022-12-05. DOI: 10.1109/ICFPT56656.2022.9974569

Sultan S. Alqahtani, Yiqun Zhu, Qizhi Shi, Xiaolin Meng, Xinhua Wang



Abstract

This paper introduces an efficient and customizable FPGA-based architecture for parallel matrix inversion. The architecture adapts to different matrix sizes while maintaining low latency and effective resource utilization. Hardware resource usage is optimized by re-using the same multiplication units for different calculations: multiple multiplication units operate in parallel to perform the normalization step and are then re-used for the elimination step. Performance is enhanced by maximizing parallelism and minimizing the sequential execution time of the division unit. Compared with related work, the implementation results show that the proposed architecture is flexible enough to support different matrix sizes with high parallel computing power. Additionally, the number of clock cycles and multiplication units of the proposed architecture is reduced proportionally to the increase in matrix size. The proposed architecture has been optimized for a Zynq xc7z045 FPGA and implemented using both single- and double-precision floating-point representations.
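
To illustrate the normalization and elimination steps mentioned in the abstract, here is a minimal software sketch. It assumes the underlying algorithm is Gauss-Jordan elimination on an augmented matrix, which the abstract suggests but does not state. The function name `invert`, the partial pivoting, and the singularity tolerance are illustrative assumptions, not details taken from the paper. Each pivot row needs a single division (the reciprocal), after which both steps reduce to multiplications; in hardware those multiplications could in principle be issued in parallel on shared multiplier units.

```python
def invert(a):
    """Invert a square matrix with Gauss-Jordan elimination on [A | I].

    Software sketch only; it does not model the paper's hardware design.
    """
    n = len(a)
    # Build the augmented matrix [A | I] using floats.
    aug = [
        [float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(a)
    ]
    for col in range(n):
        # Partial pivoting: an assumption added for numerical safety,
        # not something described in the abstract.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]

        # Normalization: one division, then one multiply per row element.
        recip = 1.0 / aug[col][col]
        aug[col] = [x * recip for x in aug[col]]

        # Elimination: multiply-subtract on every other row. These
        # multiplies are independent of each other, so the same
        # multipliers used for normalization could be re-used here.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]

    # The right half of the augmented matrix now holds the inverse.
    return [row[n:] for row in aug]
```

For example, `invert([[4, 7], [2, 6]])` returns a matrix close to `[[0.6, -0.7], [-0.2, 0.4]]`, up to floating-point rounding.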


