{"title":"大矩阵奇异值分解的可扩展并行结构","authors":"Unai Martinez-Corral, Koldo Basterretxea, Raul Finker","doi":"10.1109/FPL.2014.6927393","DOIUrl":null,"url":null,"abstract":"Singular Value Decomposition (SVD) is a key linear algebraic operation in many scientific and engineering applications, many of them involving high dimensionality datasets and real-time response. In this paper we describe a scalable parallel processing architecture for accelerating the SVD of large m × n matrices. Based on a linear array of simple processing-units (PUs), the proposed architecture follows a double data-flow paradigm (FIFO memories and a shared-bus) for optimizing the time spent in data transferences. The PUs, which perform elemental column-pair evaluations and rotations, have been designed for an efficient utilization of available FPGA resources and to achieve maximum algorithm speed-ups. The architecture is fully scalable from a two-PU scheme to an arrangement with as many as n/2 PUs. This allows for a trade-off between occupied area and processing acceleration in the final implementation, and permits the SVD processor to be implemented both on low-cost and high-end FPGAs. The system has been prototyped on Spartan-6 and Kintex-7 devices for performance comparison.","PeriodicalId":172795,"journal":{"name":"2014 24th International Conference on Field Programmable Logic and Applications (FPL)","volume":"257 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-10-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"Scalable parallel architecture for singular value decomposition of large matrices\",\"authors\":\"Unai Martinez-Corral, Koldo Basterretxea, Raul Finker\",\"doi\":\"10.1109/FPL.2014.6927393\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Singular Value Decomposition (SVD) is a key linear algebraic operation in many scientific and engineering applications, many of them involving high dimensionality datasets and real-time response. In this paper we describe a scalable parallel processing architecture for accelerating the SVD of large m × n matrices. Based on a linear array of simple processing-units (PUs), the proposed architecture follows a double data-flow paradigm (FIFO memories and a shared-bus) for optimizing the time spent in data transferences. The PUs, which perform elemental column-pair evaluations and rotations, have been designed for an efficient utilization of available FPGA resources and to achieve maximum algorithm speed-ups. The architecture is fully scalable from a two-PU scheme to an arrangement with as many as n/2 PUs. This allows for a trade-off between occupied area and processing acceleration in the final implementation, and permits the SVD processor to be implemented both on low-cost and high-end FPGAs. The system has been prototyped on Spartan-6 and Kintex-7 devices for performance comparison.\",\"PeriodicalId\":172795,\"journal\":{\"name\":\"2014 24th International Conference on Field Programmable Logic and Applications (FPL)\",\"volume\":\"257 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-10-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2014 24th International Conference on Field Programmable Logic and Applications (FPL)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/FPL.2014.6927393\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 24th International Conference on Field Programmable Logic and Applications (FPL)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/FPL.2014.6927393","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Scalable parallel architecture for singular value decomposition of large matrices
Singular Value Decomposition (SVD) is a key linear algebraic operation in many scientific and engineering applications, many of them involving high dimensionality datasets and real-time response. In this paper we describe a scalable parallel processing architecture for accelerating the SVD of large m × n matrices. Based on a linear array of simple processing-units (PUs), the proposed architecture follows a double data-flow paradigm (FIFO memories and a shared-bus) for optimizing the time spent in data transferences. The PUs, which perform elemental column-pair evaluations and rotations, have been designed for an efficient utilization of available FPGA resources and to achieve maximum algorithm speed-ups. The architecture is fully scalable from a two-PU scheme to an arrangement with as many as n/2 PUs. This allows for a trade-off between occupied area and processing acceleration in the final implementation, and permits the SVD processor to be implemented both on low-cost and high-end FPGAs. The system has been prototyped on Spartan-6 and Kintex-7 devices for performance comparison.