{"title":"Experiments with High Speed Parallel Cubing Units","authors":"Son Bui, J. Stine, M. Sadeghian","doi":"10.1109/ISVLSI.2014.97","DOIUrl":null,"url":null,"abstract":"This paper discusses modification to algorithms for computing within a parallel cubing unit. The algorithms discussed in this paper shows several architectures for various operand sizes ranging from 8 to 32 bits. The method proposed in this paper separates the cubing partial product matrix into smaller elements and organizes these partial products into repeatable manageable groups. Consequently, the overall partial product matrix is substantially reduced from previous methods. An algorithmic analysis is also presented that demonstrates reduction in area and delay for several operand widths as well as their implementations in a Vitex 5 Xilinx FPGAs and for IBM 65nm ASIC standard-cell library.","PeriodicalId":405755,"journal":{"name":"2014 IEEE Computer Society Annual Symposium on VLSI","volume":"2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 IEEE Computer Society Annual Symposium on VLSI","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISVLSI.2014.97","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 6
Abstract
This paper discusses modification to algorithms for computing within a parallel cubing unit. The algorithms discussed in this paper shows several architectures for various operand sizes ranging from 8 to 32 bits. The method proposed in this paper separates the cubing partial product matrix into smaller elements and organizes these partial products into repeatable manageable groups. Consequently, the overall partial product matrix is substantially reduced from previous methods. An algorithmic analysis is also presented that demonstrates reduction in area and delay for several operand widths as well as their implementations in a Vitex 5 Xilinx FPGAs and for IBM 65nm ASIC standard-cell library.