Automating structured matrix-matrix multiplication for stream processing
Thaddeus Koehn, P. Athanas
2016 International Conference on ReConFigurable Computing and FPGAs (ReConFig), 2016-11-01
DOI: 10.1109/ReConFig.2016.7857158 (https://doi.org/10.1109/ReConFig.2016.7857158)
Citations: 2
Abstract
Structured matrices in which at least one element is known to always be zero commonly appear in a variety of applications, including Markov processes, MIMO communications, and eigenvalue decomposition. Since matrices with known zeros require fewer computations, generating hardware to take advantage of this allows increased throughput. The approach in this paper can generate hardware for anything ranging from very sparse to completely full matrices. When dense (all elements non-zero) matrix multiplication hardware is generated, throughput is comparable to commercially available generators. As sparsity increases, throughput improves proportionally. This method also achieves a shorter processing delay compared with other techniques for sparse matrices.
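
The claim that known zeros reduce the required computation can be illustrated with a small software sketch. This is not the paper's hardware generator; it is a minimal Python illustration that assumes the zero structure of both operands is fixed ahead of time and supplied as 0/1 masks. The names `plan_macs`, `structured_matmul`, and `mac_count` are hypothetical helpers introduced here for illustration only.

```python
def plan_macs(mask_a, mask_b):
    """Decide, before any data arrives, which products can be non-zero.

    mask_a and mask_b are 0/1 matrices (lists of lists) marking the
    positions that are NOT known to be zero. Returns a dict mapping each
    output position (i, j) to the inner indices k that must be multiplied.
    """
    rows, inner, cols = len(mask_a), len(mask_b), len(mask_b[0])
    plan = {}
    for i in range(rows):
        for j in range(cols):
            ks = [k for k in range(inner) if mask_a[i][k] and mask_b[k][j]]
            if ks:  # outputs with no contributing product stay zero
                plan[(i, j)] = ks
    return plan


def structured_matmul(a, b, plan):
    """Multiply a by b, performing only the products listed in the plan."""
    rows, cols = len(a), len(b[0])
    c = [[0] * cols for _ in range(rows)]
    for (i, j), ks in plan.items():
        c[i][j] = sum(a[i][k] * b[k][j] for k in ks)
    return c


def mac_count(plan):
    """Number of multiply-accumulate operations the plan requires."""
    return sum(len(ks) for ks in plan.values())
```

For a 3x3 upper-triangular left operand and a full 3x3 right operand, such a plan contains 18 multiply-accumulates instead of the 27 a dense multiplication needs. Because the plan depends only on the masks, it is computed once and reused for every matrix pair in the stream, which is loosely analogous to fixing the structure when hardware is generated.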