Design of area-speed efficient Anurupyena Vedic multiplier for deep learning applications

IF 1.2 · Zone 4 (Engineering Technology) · JCR Q4: COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Analog Integrated Circuits and Signal Processing. Pub Date: 2024-02-09. DOI: 10.1007/s10470-024-02255-2

C. M. Kalaiselvi, R. S. Sabeenian

Volume 119, issue 3, pages 521-533.

Citations: 0

Abstract

Multipliers and dividers are essential hardware in electronic systems. This paper explores Vedic mathematics techniques for high-speed, low-area multiplication. The multiplication algorithms studied apply the Anurupyena sutra across a range of bit widths. Recent studies employ parallelism to address challenging problems, and various designs have been developed for Field Programmable Gate Array (FPGA) implementation using Very Large-Scale Integration (VLSI) design approaches and parallel computing technology. As artificial intelligence develops, research in signal processing, machine learning, and reconfigurable computing should be closely monitored. Multipliers and adders are key components of deep learning algorithms, and the multiplier is an energy-intensive component of signal processing in Arithmetic Logic Units (ALUs), Convolutional Neural Networks (CNNs), and Deep Neural Networks (DNNs). For the DNN, the proposed method introduces Booth multiplier blocks and a carry-save multiplier into the Anurupyena architecture. The Vedic mathematics algorithms are compared with traditional multiplication methods such as the array, Wallace, and Booth multipliers. On a specific hardware platform, the Vedic algorithms perform faster, use less power, and take up less space. The designs were implemented in Verilog HDL using Xilinx Vivado 2019.1 on a Kintex-7 FPGA. Area and propagation delay were reduced compared with other multiplier architectures.
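The abstract does not describe the datapath, so the following is only a minimal sketch of the arithmetic identity commonly associated with the Anurupyena sutra, not the authors' design. It assumes a working base w and writes each operand as a deviation from it: with a = w + da and b = w + db, the product is w(a + db) + da·db. The function names and the choice of base are hypothetical. When w is a power of two, the multiplication by w becomes a shift, leaving only the smaller product da·db for a multiplier block.

```python
def anurupyena_multiply(a: int, b: int, working_base: int) -> int:
    """Multiply a and b via deviations from a working base (illustrative only)."""
    da = a - working_base  # deviation of a from the base (may be negative)
    db = b - working_base  # deviation of b from the base (may be negative)
    # (w + da)(w + db) = w * (a + db) + da * db
    return working_base * (a + db) + da * db


def anurupyena_multiply_pow2(a: int, b: int, k: int) -> int:
    """Same identity with base 2**k, so the scaling step is a left shift."""
    w = 1 << k
    da = a - w
    db = b - w
    # Only da * db needs a real multiplier; the other term is shift-and-add.
    return ((a + db) << k) + da * db


if __name__ == "__main__":
    # 46 x 43 with working base 50: 50 * 39 + (-4)(-7)
    print(anurupyena_multiply(46, 43, 50))
    # 13 x 11 with working base 2**4 = 16
    print(anurupyena_multiply_pow2(13, 11, 4))
```

The identity holds for any base, but the saving depends on the deviations being narrow: the closer the operands are to the working base, the fewer bits the remaining product da·db requires.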


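Source journal

Analog Integrated Circuits and Signal Processing (Engineering Technology, Engineering: Electrical & Electronic)

CiteScore: 0.30

Self-citation rate: 7.10%

Articles published: 141

Review time: 7.3 months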

Journal description: Analog Integrated Circuits and Signal Processing is an archival, peer-reviewed journal dedicated to the design and application of analog, radio frequency (RF), and mixed-signal integrated circuits (ICs) as well as signal processing circuits and systems. It features both new research results and tutorial views and reflects the large volume of cutting-edge research activity in the worldwide field today. A partial list of topics includes analog and mixed-signal interface circuits and systems; analog and RFIC design; data converters; active-RC, switched-capacitor, and continuous-time integrated filters; mixed analog/digital VLSI systems; wireless radio transceivers; clock and data recovery circuits; and high-speed optoelectronic circuits and systems.