Design and implementation of 16 bit systolic multiplier using modular shifting algorithm

2016 Second International Conference on Science Technology Engineering and Management (ICONSTEM) Pub Date : 2016-03-01 DOI:10.1109/ICONSTEM.2016.7560950

S. Jayarajkumar, Kaliannan Sivanandam


Citations: 3

Abstract

Finite field multipliers with high throughput and low latency have attracted considerable attention in recent cryptographic systems and coding theory, but such multipliers over the Galois field GF(2^m) for National Institute of Standards and Technology (NIST) pentanomials are not plentiful. We introduce two pairs of low-latency, high-throughput bit-parallel and digit-serial systolic multipliers based on NIST pentanomials. We propose a unique decomposition technique to realize the multiplication by several parallel arrays in a two-dimensional (2-D) systolic structure (BP-I) with a critical path of 2Tx, where Tx is the propagation delay of an XOR gate. The parallel arrays in the 2-D systolic structure are projected along the vertical direction to obtain the proposed 16-bit digit-serial structure (PDS-I) with the same critical path. For high-throughput applications, we propose another pair of structures, bit-parallel (BP-II) and modified 16-bit digit-serial (PDS-II), based on a unique modular reduction method, in which the critical path is reduced to (Ta + Tx), where Ta is the propagation delay of an AND gate. Data sharing between a pair of processing elements (PEs) of adjacent systolic arrays is suggested to further reduce the area complexity of BP-I and BP-II. Existing methods consume more power and incur high area overhead. The proposed systolic multiplier reduces area and power for ASIC implementations and also reduces the average computation time. The systolic multiplier is therefore a better choice for high-speed VLSI implementation.
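To make the arithmetic concrete, the following is a minimal software sketch of GF(2^m) multiplication with modular reduction, in a bit-serial form and in a digit-serial form that consumes 16 bits of one operand per step. It is the textbook shift-and-reduce method, not the authors' systolic architecture. The field size m = 163, the NIST pentanomial x^163 + x^7 + x^6 + x^3 + 1, and all function names are assumptions chosen for illustration; the abstract does not state them.

```python
# Illustrative sketch only: textbook GF(2^m) shift-and-reduce multiplication.
# M, POLY and the function names are assumptions, not taken from the paper.

M = 163
# NIST pentanomial for GF(2^163): x^163 + x^7 + x^6 + x^3 + 1
POLY = (1 << 163) | (1 << 7) | (1 << 6) | (1 << 3) | 1


def gf2m_reduce(c, m=M, poly=POLY):
    """Reduce polynomial c (as an int bit-vector) modulo poly of degree m."""
    for i in range(c.bit_length() - 1, m - 1, -1):
        if (c >> i) & 1:
            c ^= poly << (i - m)  # cancel the term x^i
    return c


def gf2m_mul_bit_serial(a, b, m=M, poly=POLY):
    """MSB-first bit-serial multiply: one bit of b per iteration."""
    acc = 0
    for i in reversed(range(m)):
        acc <<= 1                 # multiply the accumulator by x
        if (acc >> m) & 1:        # degree reached m: subtract (XOR) poly
            acc ^= poly
        if (b >> i) & 1:          # AND of the b-bit with a, then XOR-accumulate
            acc ^= a
    return acc


def gf2m_mul_digit_serial(a, b, m=M, poly=POLY, d=16):
    """MSB-first digit-serial multiply: d bits of b per iteration."""
    acc = 0
    ndigits = -(-m // d)          # ceil(m / d)
    mask = (1 << d) - 1
    for j in reversed(range(ndigits)):
        digit = (b >> (j * d)) & mask
        # acc <- acc * x^d mod poly, reducing after every single shift
        for _ in range(d):
            acc <<= 1
            if (acc >> m) & 1:
                acc ^= poly
        # partial product digit * a (carry-less), then reduce and accumulate
        partial = 0
        for k in range(d):
            if (digit >> k) & 1:
                partial ^= a << k
        acc ^= gf2m_reduce(partial, m, poly)
    return acc
```

Both functions compute the same product. The digit-serial version needs ceil(m/d) outer iterations instead of m, which is the latency versus area trade-off that digit-serial hardware structures exploit. In hardware, each AND/XOR step in the loops corresponds to the gate delays (Ta, Tx) that the abstract counts in the critical path.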
