Scalable parallel architecture for singular value decomposition of large matrices

2014 24th International Conference on Field Programmable Logic and Applications (FPL). Publication date: 2014-10-20. DOI: 10.1109/FPL.2014.6927393

Unai Martinez-Corral, Koldo Basterretxea, Raul Finker


Citations: 5

Abstract

Singular value decomposition (SVD) is a key linear-algebra operation in many scientific and engineering applications, many of which involve high-dimensional datasets and real-time response requirements. This paper describes a scalable parallel processing architecture for accelerating the SVD of large m × n matrices. Built on a linear array of simple processing units (PUs), the architecture follows a double data-flow paradigm (FIFO memories and a shared bus) to optimize the time spent on data transfers. The PUs, which perform elemental column-pair evaluations and rotations, are designed to use the available FPGA resources efficiently and to maximize algorithm speed-up. The architecture scales from a two-PU scheme to an arrangement with as many as n/2 PUs. This allows a trade-off between occupied area and processing acceleration in the final implementation, and lets the SVD processor be implemented on both low-cost and high-end FPGAs. The system has been prototyped on Spartan-6 and Kintex-7 devices for performance comparison.
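The abstract does not name the underlying algorithm. Column-pair evaluations followed by rotations are characteristic of the one-sided (Hestenes) Jacobi method, so the sketch below assumes that method. It is a plain software illustration, not the authors' hardware design: the function names (`round_robin_pairs`, `rotate_pair`, `singular_values`), the round-robin pair ordering, the tolerance, and the sweep limit are all assumptions made for this example. The point it tries to show is why up to n/2 PUs can be kept busy: each round of the schedule contains n/2 disjoint column pairs, so the rotations within a round do not depend on one another.

```python
import math


def round_robin_pairs(n):
    """Yield n-1 rounds of n/2 disjoint column pairs (n must be even).

    No column index appears twice within a round, so the pairs of one
    round could in principle be handled by up to n/2 independent units.
    """
    idx = list(range(n))
    for _ in range(n - 1):
        yield [tuple(sorted((idx[k], idx[n - 1 - k]))) for k in range(n // 2)]
        # Keep idx[0] fixed and rotate the remaining entries by one place.
        idx = [idx[0], idx[-1]] + idx[1:-1]


def rotate_pair(cols, i, j, tol):
    """Evaluate columns i and j, then orthogonalise them with a rotation.

    Returns True if a rotation was applied, False if the columns were
    already orthogonal to within the relative tolerance tol.
    """
    a, b = cols[i], cols[j]
    alpha = sum(x * x for x in a)             # squared norm of column i
    beta = sum(y * y for y in b)              # squared norm of column j
    gamma = sum(x * y for x, y in zip(a, b))  # inner product of the pair
    if abs(gamma) <= tol * math.sqrt(alpha * beta):
        return False
    zeta = (beta - alpha) / (2.0 * gamma)
    t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
    c = 1.0 / math.sqrt(1.0 + t * t)
    s = c * t
    cols[i] = [c * x - s * y for x, y in zip(a, b)]
    cols[j] = [s * x + c * y for x, y in zip(a, b)]
    return True


def singular_values(matrix, tol=1e-12, max_sweeps=30):
    """Return the singular values of an m x n matrix (list of rows)."""
    m, n = len(matrix), len(matrix[0])
    cols = [[matrix[r][c] for r in range(m)] for c in range(n)]
    slots = n + (n % 2)  # pad the schedule with a dummy index if n is odd
    for _ in range(max_sweeps):
        rotated = False
        for pairs in round_robin_pairs(slots):
            for i, j in pairs:  # pairs within a round are independent
                if j < n:       # skip pairs that involve the dummy index
                    rotated |= rotate_pair(cols, i, j, tol)
        if not rotated:
            break  # a full sweep without rotations: columns are orthogonal
    norms = (math.sqrt(sum(x * x for x in col)) for col in cols)
    return sorted(norms, reverse=True)
```

Calling `singular_values([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])` returns the two singular values in descending order. In a parallel setting, the inner loop over `pairs` is the part that would be distributed across processing units; with fewer than n/2 units, each unit would simply work through several pairs per round, which mirrors the area versus speed trade-off the abstract describes.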
