Authors: Feng Jing, Zijun Liu, Xiaojun Ma, Guo Yang, Guo Peng, Donglin Wang
Venue: 2017 8th IEEE International Conference on Software Engineering and Service Science (ICSESS)
Publication date: 2017-11-01
DOI: 10.1109/ICSESS.2017.8342979 (https://doi.org/10.1109/ICSESS.2017.8342979)
A reconfigurable high-performance multiplier based on multi-granularity design and parallel acceleration
This paper proposes a reconfigurable high-performance multiplier (RHPM) based on multi-granularity design and parallel acceleration. To support multiple precisions for different processing requirements, the RHPM can perform one 32×32-bit, two 16×16-bit, or four 8×8-bit unsigned or signed multiplications, or one 16×16-bit or two 8×8-bit complex multiplications. The structures of the partial product generator and the partial product accumulator are improved so that most of the hardware resources are reused across these modes. Compression is completed automatically by recording the validity of every bit in the partial product array, which accelerates the computation dramatically. The RHPM is implemented in TSMC 28 nm technology and exhibits a critical path delay of 0.68 ns while consuming only 0.6281 mW of power. Results show significant superiority in performance and power efficiency compared with our previous work and other similar products.
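The abstract does not give the circuit details, so the following is only a behavioral sketch of one common way a single partial product array can serve several precisions. It assumes unsigned operands and that a partial product bit is "valid" only when both source bits lie in the same lane. The names `simd_multiply`, `unpack_lanes`, and `lane_bits` are hypothetical and are not taken from the paper, and the signed and complex modes are not modeled.

```python
def simd_multiply(a: int, b: int, lane_bits: int, total_bits: int = 32) -> int:
    """Multiply packed unsigned lanes using one masked partial-product array.

    Assumption (not from the paper): bit (i, j) of the array is a_i AND b_j,
    and it is marked valid only if i and j fall in the same lane.
    """
    if total_bits % lane_bits != 0:
        raise ValueError("lane_bits must divide total_bits")
    result = 0
    for i in range(total_bits):
        for j in range(total_bits):
            # Validity flag: cross-lane partial products are suppressed.
            valid = (i // lane_bits) == (j // lane_bits)
            if valid and (a >> i) & 1 and (b >> j) & 1:
                result += 1 << (i + j)
    return result


def unpack_lanes(value: int, lane_bits: int, total_bits: int = 32) -> list[int]:
    """Split the packed result into per-lane products, lowest lane first."""
    width = 2 * lane_bits  # each lane product occupies twice the lane width
    mask = (1 << width) - 1
    lanes = total_bits // lane_bits
    return [(value >> (k * width)) & mask for k in range(lanes)]


if __name__ == "__main__":
    a, b = 0x12345678, 0x9ABCDEF0
    for lane_bits in (32, 16, 8):
        packed = simd_multiply(a, b, lane_bits)
        print(lane_bits, [hex(p) for p in unpack_lanes(packed, lane_bits)])
```

Because the bits of lane k carry weights from 2^(2·k·lane_bits) upward and each lane product fits in 2·lane_bits bits, the per-lane products land in disjoint fields of the 64-bit result. The same array therefore yields one 32×32, two 16×16, or four 8×8 products just by changing which bits are marked valid.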