Finding Space-Time Stream Permutations for Minimum Memory and Latency
Thaddeus Koehn, P. Athanas
2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)
Published: 2016-05-01
DOI: 10.1109/FCCM.2016.54 (https://doi.org/10.1109/FCCM.2016.54)
Citations: 1
Abstract
Many algorithms that process parallel data streams require permutation units because the streams are not independent. Such algorithms include transforms, multi-rate signal processing, and Viterbi decoding. The absolute order of the data elements leaving the permutation is not important; what matters is that the elements are located correctly for the next processing step. This paper describes a method for finding permutations that require a minimum amount of memory and latency. The required permutations are generated from the data dependencies of a computation set. Additional constraints are imposed so that the parallel streaming architecture processes the data without flow control. The results agree with those of brute-force methods, which become computationally infeasible for large permutation sets.
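To make the brute-force baseline mentioned in the abstract concrete, the following is a minimal sketch under a simplified cost model. It is not the authors' method. The assumptions are: element `i` arrives on cycle `i // width`, the next processing step consumes one group of co-located elements per cycle, and an element emitted on the cycle it arrives needs no storage. The names `cost` and `brute_force` are hypothetical.

```python
from itertools import permutations


def cost(groups, width):
    """Return (latency, memory) for emitting `groups` in order, one per cycle.

    Assumed model: element i arrives on cycle i // width; a group can only be
    emitted once all of its elements have arrived; no flow control, so the
    whole output schedule is shifted by a single fixed latency.
    """
    def arrive(e):
        return e // width

    # Smallest shift such that every group is complete when it is emitted.
    latency = max(0, max(max(arrive(e) for e in g) - t
                         for t, g in enumerate(groups)))
    leave = {e: t + latency for t, g in enumerate(groups) for e in g}
    last = max(leave.values())
    # An element occupies memory on cycle t if it has arrived but leaves later.
    memory = max(sum(1 for e, out in leave.items() if arrive(e) <= t < out)
                 for t in range(last + 1))
    return latency, memory


def brute_force(groups, width):
    """Try every group order; return (latency, memory, order) with least memory,
    breaking ties by latency. The search space grows factorially."""
    candidates = (cost(order, width) + (order,) for order in permutations(groups))
    return min(candidates, key=lambda r: (r[1], r[0]))


if __name__ == "__main__":
    # Eight elements on two streams, paired as (i, i + 4).
    pairs = ((0, 4), (1, 5), (2, 6), (3, 7))
    print(brute_force(pairs, width=2))
```

Because only the grouping matters and not the absolute output order, the search ranges over group orders. With `n` groups it evaluates `n!` candidates, which illustrates why exhaustive search becomes infeasible for large permutation sets.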