{"title":"An efficient architecture of truncated booth multiplier for AI application","authors":"Shareefa Fairoose P. , Ashutosh Mishra","doi":"10.1016/j.vlsi.2025.102544","DOIUrl":null,"url":null,"abstract":"<div><div>This paper presents truncated approximate booth multipliers (TR-ABMs) that are both energy- and area-efficient, leveraging novel Leading One/Zero Position Detectors (LOZPDs) and optimized approximate booth multiplier (ABM) architectures. The proposed <span>LOZPD</span>s enable substantial reductions in Area-Delay Product (<span>ADP</span>) and Power-Delay Product (<span>PDP</span>) relative to existing techniques. Two architectures are introduced: <span>TR-ABM1</span> integrates <span>LOZPD</span>-based operand truncation with a conventional Booth multiplier, while <span>TR-ABM2</span> employs an approximate Booth multiplier variant for further efficiency gains. The level of approximation is tunable through key design parameters, including multiplier width (<span><math><mi>w</mi></math></span>) and the number of partial product columns utilizing Approximate Partial Product Generators (<span>APG</span>s) and Approximate Compressors (<span>AC</span>s). Comprehensive error analysis is conducted via Monte Carlo simulations with 10 million random inputs, and the designs are synthesized using Cadence® Genus in 90 nm CMOS technology. The <span>TR-ABM</span>s are evaluated in neural network (<span>NN</span>) inference and 64-point Fast Fourier Transform (<span>FFT64</span>) applications. For MNIST handwritten digit classification, the <span>TR-ABM</span>s achieve accuracy on par with exact fixed-point Booth multipliers. In <span>FFT64</span>, the proposed designs deliver significant area and power savings over state-of-the-art approximate multipliers. Specifically, the <span>TR-ABM</span>s achieve 66.58%–75.59% reductions in <span>ADP</span> and 47.91%–60.94% reductions in <span>PDP</span>, while maintaining reliable computational accuracy. 
Overall, the <span>TR-ABM</span>s offer a superior accuracy-performance trade-off compared to prior approximate multipliers, making them highly suitable for energy-efficient artificial intelligence and signal processing applications.</div></div>","PeriodicalId":54973,"journal":{"name":"Integration-The Vlsi Journal","volume":"106 ","pages":"Article 102544"},"PeriodicalIF":2.5000,"publicationDate":"2025-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Integration-The Vlsi Journal","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167926025002019","RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Citations: 0
Abstract
This paper presents truncated approximate Booth multipliers (TR-ABMs) that are both energy- and area-efficient, leveraging novel Leading One/Zero Position Detectors (LOZPDs) and optimized approximate Booth multiplier (ABM) architectures. The proposed LOZPDs enable substantial reductions in Area-Delay Product (ADP) and Power-Delay Product (PDP) relative to existing techniques. Two architectures are introduced: TR-ABM1 integrates LOZPD-based operand truncation with a conventional Booth multiplier, while TR-ABM2 employs an approximate Booth multiplier variant for further efficiency gains. The level of approximation is tunable through key design parameters, including the multiplier width (w) and the number of partial product columns that use Approximate Partial Product Generators (APGs) and Approximate Compressors (ACs). Comprehensive error analysis is conducted via Monte Carlo simulations with 10 million random inputs, and the designs are synthesized using Cadence® Genus in 90 nm CMOS technology. The TR-ABMs are evaluated in neural network (NN) inference and 64-point Fast Fourier Transform (FFT64) applications. For MNIST handwritten digit classification, the TR-ABMs achieve accuracy on par with exact fixed-point Booth multipliers. In FFT64, the proposed designs deliver significant area and power savings over state-of-the-art approximate multipliers. Specifically, the TR-ABMs achieve 66.58%–75.59% reductions in ADP and 47.91%–60.94% reductions in PDP, while maintaining reliable computational accuracy. Overall, the TR-ABMs offer a superior accuracy-performance trade-off compared to prior approximate multipliers, making them highly suitable for energy-efficient artificial intelligence and signal processing applications.
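The abstract does not give circuit-level details, so the following is only a rough behavioural sketch of the general idea of leading one/zero based operand truncation followed by a Monte Carlo error estimate. It is not the authors' TR-ABM1 or TR-ABM2 design. Everything here is an assumption made for illustration: the function names, the interpretation of w as the number of retained significant bits, the 16-bit signed operand range, the use of an exact product on the truncated operands, and the default sample count (far smaller than the 10 million inputs reported in the abstract).

```python
import random


def significant_bits(x: int) -> int:
    """Number of bits up to and including the leading one (x >= 0)
    or the leading zero (x < 0) of a two's-complement integer."""
    return x.bit_length() if x >= 0 else (~x).bit_length()


def truncate_operand(x: int, w: int) -> int:
    """Keep the w most significant bits below the sign run, zero the rest.

    Hypothetical model of detector-guided truncation; the arithmetic
    right shift floors toward minus infinity for negative inputs.
    """
    shift = max(significant_bits(x) - w, 0)
    return (x >> shift) << shift


def approx_multiply(a: int, b: int, w: int) -> int:
    """Exact product of the two truncated operands (illustrative only)."""
    return truncate_operand(a, w) * truncate_operand(b, w)


def mean_relative_error(w: int, bits: int = 16,
                        samples: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the mean relative error over random signed inputs."""
    rng = random.Random(seed)
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    total, counted = 0.0, 0
    for _ in range(samples):
        a, b = rng.randint(lo, hi), rng.randint(lo, hi)
        exact = a * b
        if exact == 0:
            continue  # relative error is undefined for a zero product
        total += abs(approx_multiply(a, b, w) - exact) / abs(exact)
        counted += 1
    return total / counted


if __name__ == "__main__":
    for w in (4, 6, 8):
        print(f"w={w}: mean relative error = {mean_relative_error(w):.5f}")
```

In this sketch a larger w keeps more bits and lowers the estimated error, which mirrors the tunable accuracy trade-off the abstract describes; the hardware cost side of that trade-off (ADP, PDP) can only be obtained from synthesis and is not modelled here.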
About the journal:
Integration's aim is to cover every aspect of the VLSI area, with an emphasis on cross-fertilization between various fields of science, and the design, verification, test and applications of integrated circuits and systems, as well as closely related topics in process and device technologies. Individual issues will feature peer-reviewed tutorials and articles as well as reviews of recent publications. The intended coverage of the journal can be assessed by examining the following (non-exclusive) list of topics:
Specification methods and languages; Analog/Digital Integrated Circuits and Systems; VLSI architectures; Algorithms, methods and tools for modeling, simulation, synthesis and verification of integrated circuits and systems of any complexity; Embedded systems; High-level synthesis for VLSI systems; Logic synthesis and finite automata; Testing, design-for-test and test generation algorithms; Physical design; Formal verification; Algorithms implemented in VLSI systems; Systems engineering; Heterogeneous systems.