High throughput Cholesky decomposition based on FPGA
Jun Luo, Qijun Huang, Sheng Chang, Xiaoying Song, Yun Shang
2013 6th International Congress on Image and Signal Processing (CISP), published 2013-12-01
DOI: 10.1109/CISP.2013.6743941 (https://doi.org/10.1109/CISP.2013.6743941)
Citations: 8
Abstract
Cholesky decomposition is widely used in engineering and scientific computing, and accelerating it is an important concern in many of these applications. This paper presents a high-throughput hardware implementation of the LL^T Cholesky decomposition, applied to Wiener filtering under the minimum mean square error criterion. For efficiency, the design uses multiple fixed-point structures and several pipeline stages, and it exploits the parallelism in the algorithm to improve throughput. Results show a significant speedup over the software-based approach.
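For readers unfamiliar with the LL^T factorization, the following is a minimal floating-point sketch in Python of the kind of software baseline the abstract refers to. It is not the paper's fixed-point FPGA design. The function names `cholesky_llt` and `solve_spd` are hypothetical, and the reading of the Wiener problem as a symmetric positive-definite system A x = b (for example, an autocorrelation matrix and a cross-correlation vector) is an assumption made for illustration.

```python
import math


def cholesky_llt(a):
    """Return lower-triangular L with a = L * L^T.

    `a` is a symmetric positive-definite matrix given as a list of rows.
    Illustrative software reference only; not the paper's hardware design.
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: subtract the squares already placed in row j.
        d = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        l[j][j] = math.sqrt(d)
        # Entries below the diagonal in column j do not depend on each other,
        # so they could in principle be computed in parallel.
        for i in range(j + 1, n):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = (a[i][j] - s) / l[j][j]
    return l


def solve_spd(a, b):
    """Solve a x = b using L L^T: forward substitution, then back substitution."""
    n = len(a)
    l = cholesky_llt(a)
    # Forward substitution: L y = b.
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    # Back substitution: L^T x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x
```

In this sketch the inner loop over `i` is the natural place for parallel evaluation, since each entry of a column needs only values from earlier columns. A fixed-point version would additionally have to choose word lengths and handle the square root and division explicitly, which this example does not attempt.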