Optimal Circuits for Streamed Linear Permutations Using RAM
F. Serre, Thomas Holenstein, Markus Püschel
Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2016-02-21
DOI: 10.1145/2847263.2847277 (https://doi.org/10.1145/2847263.2847277)
Citations: 15
Abstract
We propose a method to automatically derive hardware structures that perform a fixed linear permutation on streaming data. Linear permutations are permutations that map the bit representation of the element addresses linearly. This set contains many of the most important permutations in media processing, communication, and other applications, and includes perfect shuffles, stride permutations, and bit reversal. Streaming means that the data to be permuted arrive as a sequence of chunks over several cycles. We solve this problem by mathematically decomposing a given permutation into a sequence of three permutations that are either temporal or spatial. The former are implemented as banks of RAM, the latter as switching networks. We prove optimality of our solution in terms of the number of switches in these networks.
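To make the notion of a linear permutation concrete, the following minimal Python sketch treats each address of a 2^n-element data set as a bit vector and multiplies it by an invertible n x n bit matrix over GF(2). The function names, the least-significant-bit-first convention, and the two example matrices are assumptions chosen for illustration; they are not taken from the paper and do not model its RAM banks or switching networks.

```python
def to_bits(i, n):
    """Bit vector of address i, least significant bit first (assumed convention)."""
    return [(i >> k) & 1 for k in range(n)]


def from_bits(bits):
    """Inverse of to_bits: rebuild the integer address from its bit vector."""
    return sum(b << k for k, b in enumerate(bits))


def apply_bit_matrix(P, i):
    """Map address i to the address whose bit vector is P * bits(i) over GF(2)."""
    n = len(P)
    x = to_bits(i, n)
    y = [sum(P[r][c] & x[c] for c in range(n)) % 2 for r in range(n)]
    return from_bits(y)


def linear_permutation(P):
    """Table of destination addresses for all 2^n source addresses."""
    n = len(P)
    return [apply_bit_matrix(P, i) for i in range(2 ** n)]


def bit_reversal_matrix(n):
    """Anti-diagonal matrix: output bit r is input bit n-1-r."""
    return [[1 if c == n - 1 - r else 0 for c in range(n)] for r in range(n)]


def rotation_matrix(n, s=1):
    """Cyclic shift matrix: output bit r is input bit (r - s) mod n."""
    return [[1 if c == (r - s) % n else 0 for c in range(n)] for r in range(n)]


if __name__ == "__main__":
    print(linear_permutation(bit_reversal_matrix(3)))  # [0, 4, 2, 6, 1, 5, 3, 7]
    print(linear_permutation(rotation_matrix(3)))      # [0, 2, 4, 6, 1, 3, 5, 7]
```

The anti-diagonal matrix yields bit reversal, and the cyclic shift matrix yields a shuffle-type (stride) permutation on 8 elements. Whether the shift corresponds to a perfect shuffle or its inverse depends on the bit-ordering convention, which is assumed here.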