IEEE Transactions on Very Large Scale Integration (VLSI) Systems: Latest Articles

A 22-nm All-Digital Time-Domain Neural Network Accelerator for Precision In-Sensor Processing
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2024-11-19 DOI: 10.1109/TVLSI.2024.3496090
Ahmed M. Mohey;Jelin Leslin;Gaurav Singh;Marko Kosunen;Jussi Ryynänen;Martin Andraud
{"title":"A 22-nm All-Digital Time-Domain Neural Network Accelerator for Precision In-Sensor Processing","authors":"Ahmed M. Mohey;Jelin Leslin;Gaurav Singh;Marko Kosunen;Jussi Ryynänen;Martin Andraud","doi":"10.1109/TVLSI.2024.3496090","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3496090","url":null,"abstract":"Deep neural network (DNN) accelerators are increasingly integrated into sensing applications, such as wearables and sensor networks, to provide advanced in-sensor processing capabilities. Given wearables’ strict size and power requirements, minimizing the area and energy consumption of DNN accelerators is a critical concern. In that regard, computing DNN models in the time domain is a promising architecture, taking advantage of both technology scaling friendliness and efficiency. Yet, time-domain accelerators are typically not fully digital, limiting the full benefits of time-domain computation. In this work, we propose an all-digital time-domain accelerator with a small size and low energy consumption to target precision in-sensor processing like human activity recognition (HAR). The proposed accelerator features a simple and efficient architecture without dependencies on analog nonidealities such as leakage and charge errors. An eight-neuron layer (core computation layer) is implemented in 22-nm FD-SOI technology. The layer occupies 70 × 70 μm while supporting multibit inputs (8-bit) and weights (8-bit) with signed accumulation up to 18 bits. 
The power dissipation of the computation layer is 576 μW at 0.72-V supply and 500-MHz clock frequency, achieving an average area efficiency of 24.74 GOPS/mm² (up to 544.22 GOPS/mm²), an average energy efficiency of 0.21 TOPS/W (up to 4.63 TOPS/W), and a normalized energy efficiency of 13.46 1b-TOPS/W (up to 296.30 1b-TOPS/W).","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"32 12","pages":"2220-2231"},"PeriodicalIF":2.8,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142821135","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
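The efficiency figures quoted in the abstract above can be cross-checked against one another. The sketch below is only a consistency check, not the authors' methodology. It assumes the layer area is 70 μm × 70 μm and that the 1b-normalized figure scales energy efficiency by the product of the input and weight bit widths (8 × 8). Neither assumption is stated explicitly in the abstract, and the name `derived_metrics` is illustrative.

```python
# Consistency check of figures quoted in the abstract (illustrative only).
AREA_MM2 = (70e-3) ** 2   # assumed 70 um x 70 um layer, about 0.0049 mm^2
POWER_W = 576e-6          # 576 uW at 0.72 V and 500 MHz, as quoted
BIT_PRODUCT = 8 * 8       # assumed normalization: input bits x weight bits


def derived_metrics(area_eff_gops_per_mm2):
    """Return (throughput in GOPS, TOPS/W, 1b-TOPS/W) for a given area efficiency."""
    gops = area_eff_gops_per_mm2 * AREA_MM2
    tops_per_w = gops * 1e9 / POWER_W / 1e12
    return gops, tops_per_w, tops_per_w * BIT_PRODUCT


avg = derived_metrics(24.74)    # average area efficiency from the abstract
peak = derived_metrics(544.22)  # peak area efficiency from the abstract
print("average:", avg)
print("peak:   ", peak)
```

Under these assumptions the average works out to roughly 0.21 TOPS/W and about 13.47 1b-TOPS/W, and the peak to roughly 4.63 TOPS/W and 296.3 1b-TOPS/W, which is in line with the quoted values up to rounding.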
Citations: 0
A Comprehensive Digital Calibration for Pipelined ADCs Using Cascaded Nonlinearity Correction
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2024-11-19 DOI: 10.1109/TVLSI.2024.3496669
Yuguo Xiang;Dayan Zhou;Minjia Song;Danfeng Zhai;Jingchao Lan;Junyan Ren;Fan Ye
{"title":"A Comprehensive Digital Calibration for Pipelined ADCs Using Cascaded Nonlinearity Correction","authors":"Yuguo Xiang;Dayan Zhou;Minjia Song;Danfeng Zhai;Jingchao Lan;Junyan Ren;Fan Ye","doi":"10.1109/TVLSI.2024.3496669","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3496669","url":null,"abstract":"This brief presents a digital calibration for pipelined analog-to-digital converters (ADCs) utilizing the cascaded nonlinearity correction (CNC) method. By cascading three correction layers for compensating nonlinearities in different parts of pipelined ADC, it comprehensively calibrates distortion in both ADC front end and back end with a low hardware cost. In addition, this work employs a discriminative fine-tuning least-mean-square (DFT-LMS) algorithm with varying step sizes for different layers, thereby improving both the convergence speed and the accuracy. An 800-MS/s, 12-bit ring amplifier-based pipelined ADC is presented to verify the proposed calibration technique. With calibration, the SFDR has a 26.7-dB improvement at low frequency and 23.6-dB improvement at Nyquist frequency, resulting in over 6-dB improvement compared with prior-art calibration techniques. The calibration algorithm has been verified on a TSMC 28-nm CMOS process. 
The experimental results show that the proposed ADC calibrator has an area of 6592 μm² and consumes 5.31 mW at 800-MHz clock rate.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 4","pages":"1192-1196"},"PeriodicalIF":2.8,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143667350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Static-Linearity Enhancement Techniques for Digital-to-Analog Converters Exploiting Optimal Arrangements of Unit Elements
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2024-11-18 DOI: 10.1109/TVLSI.2024.3495558
Francesco Gagliardi;Danilo Scintu;Massimo Piotto;Paolo Bruschi;Michele Dei
{"title":"Static-Linearity Enhancement Techniques for Digital-to-Analog Converters Exploiting Optimal Arrangements of Unit Elements","authors":"Francesco Gagliardi;Danilo Scintu;Massimo Piotto;Paolo Bruschi;Michele Dei","doi":"10.1109/TVLSI.2024.3495558","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3495558","url":null,"abstract":"Driven by the ongoing challenge of designing high-accuracy digital-to-analog converters (DACs) at the cost of a relatively small area occupation, optimal combination algorithms (OCAs) recently gained attention within the myriad of possible calibration techniques for DACs. OCAs show appealing properties with respect to traditional approaches such as dynamic element matching (DEM). At start-up or upon request, mismatches affecting DAC elements are measured on-chip, allowing rearrangement in the selection logic of the DAC unit elements. The newly found arrangement is, hence, used during normal operation, achieving superior linearity. As of today, several alternative OCAs have been proposed; however, designers willing to implement OCA-calibrated DACs are faced with unclear tradeoffs and insufficient design guidelines. In this work, we provide a detailed comparison of existing OCAs based on statistical behavioral simulations. Starting from this, we investigate the relationships between OCAs’ performances and circuit-level design aspects. Specifically, OCAs’ effectiveness in improving the static linearity is linked to the number of DAC bits and the accuracy of the auxiliary comparator required by every OCA. 
Unforeseen trends emerge, and new design considerations are suggested, fostering novel awareness on the subject of high-accuracy DAC designs enabled by OCA-based calibration techniques.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"32 12","pages":"2243-2256"},"PeriodicalIF":2.8,"publicationDate":"2024-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10756519","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142821155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
SMBHA: A System-Level Multicore BGV Hardware Accelerator Based on FPGA
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2024-11-14 DOI: 10.1109/TVLSI.2024.3480997
Jia-Li Duan;Chi Zhang;Li-Hui Wang;Lei Shen
{"title":"SMBHA: A System-Level Multicore BGV Hardware Accelerator Based on FPGA","authors":"Jia-Li Duan;Chi Zhang;Li-Hui Wang;Lei Shen","doi":"10.1109/TVLSI.2024.3480997","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3480997","url":null,"abstract":"Fully homomorphic encryption (FHE) enables calculations on encrypted data and is a crucial foundation for achieving privacy computing. However, the high computation overhead restricts its widespread application. Even after algorithm and software optimization, its processing speed remains low. This article proposes the first practical system-level multicore Brakerski-Gentry-Vaikuntanathan (BGV) hardware acceleration scheme based on field-programmable gate array (FPGA). By analyzing the bottleneck of system acceleration, a hierarchical storage structure is introduced to reduce data movement. A novel 4-2 mixed-radix number theoretic transform (NTT) algorithm is proposed, allowing flexible switching between radix-4 and radix-2, with the ability to reuse twiddle factors. In addition, a reconfigurable processing element (PE) is proposed that supports all homomorphic operations of BGV. The design of this article is evaluated on Xilinx Virtex7 series FPGA, achieving a throughput of NTT/inverse NTT (INTT) up to 14× higher than previous designs. 
Compared with simple encrypted arithmetic library (SEAL), the full system performances of homomorphic encryption (ENC), decryption (DEC), and homomorphic multiplication achieve improvements of 13.9×, 7.07×, and 16.6×, respectively.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 2","pages":"546-557"},"PeriodicalIF":2.8,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142993412","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Design and Analysis of a 26–32-GHz 6-bit Passive Vector Modulation Phase Shifter for CMOS Bidirectional Transceiver
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2024-11-14 DOI: 10.1109/TVLSI.2024.3490618
Yechen Tian;Yutong Zhang;Junjie Gu;Hao Xu;Weitian Liu;Rui Yin;Zongming Duan;Hao Gao;Na Yan
{"title":"Design and Analysis of a 26–32-GHz 6-bit Passive Vector Modulation Phase Shifter for CMOS Bidirectional Transceiver","authors":"Yechen Tian;Yutong Zhang;Junjie Gu;Hao Xu;Weitian Liu;Rui Yin;Zongming Duan;Hao Gao;Na Yan","doi":"10.1109/TVLSI.2024.3490618","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3490618","url":null,"abstract":"This article presents a 26–32-GHz 6-bit bidirectional passive vector modulation phase shifter (PVM-PS) in 40-nm CMOS for phased array systems. The passive phase shifter comprises a center-tap transformer-based quadrature generator/combiner, two 6-bit X-type attenuators, and a differential Wilkinson power combiner/divider. The symmetric design enables bidirectional signal propagation and offers flexible system configuration. Passive switches are sized to optimize the tradeoff among gain variation, insertion loss, and linearity. The phase shifter implemented in 40 nm covers a range of 360° with 5.625° resolution and the rms phase error is between 0.4° and 1.3°. It exhibits <1-dB magnitude imbalance and <1.2° phase imbalance between forward and reverse propagation modes. Its OP1dB is above −1 dBm across the operation frequency.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 3","pages":"673-684"},"PeriodicalIF":2.8,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143489100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A 16-bit 1-MS/s SAR ADC With Capacitor Mismatch Self-Calibration
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2024-11-14 DOI: 10.1109/TVLSI.2024.3489231
Jie Ding;Fuming Liu;Kuan Deng;Zihan Zheng;Jingnan Zheng;Yongzhen Chen;Jiangfeng Wu
{"title":"A 16-bit 1-MS/s SAR ADC With Capacitor Mismatch Self-Calibration","authors":"Jie Ding;Fuming Liu;Kuan Deng;Zihan Zheng;Jingnan Zheng;Yongzhen Chen;Jiangfeng Wu","doi":"10.1109/TVLSI.2024.3489231","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3489231","url":null,"abstract":"This article introduces a successive approximation register (SAR) analog-to-digital converter (ADC) that utilizes a foreground capacitor mismatch self-calibration method. The proposed floating operation puts the uncalibrated high-bit capacitor into the floating state, preventing the sub-ADC from saturating caused by comparator static offset during the calibration process. To address the random mismatch of the LSB capacitors and improve the calibration accuracy, this article employs round-robin grouping of eight sets of LSB capacitors. In addition, a precharged bootstrapped switch is proposed to achieve high sampling linearity with low power consumption and area overhead. An anti-interference custom-designed 0.5-fF capacitor structure is suggested for binary-weighted capacitor mismatch of capacitive DAC (CDAC). Furthermore, the circuit implementation of the comparator utilized by ADC is also discussed. The prototype was fabricated in a 180-nm CMOS process with a 1.8-V supply and achieved spurious-free dynamic ranges of 108.9 and 92.38 dB at an input frequency of 1 kHz while operating at sampling rates of 100 kS/s and 1 MS/s, respectively. 
The prototype consumes 6.745 mW and occupies 0.91 mm².","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 1","pages":"10-20"},"PeriodicalIF":2.8,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142918422","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A 360° Tunable Phase Shifter With Low Phase Error Based on Bandpass Networks in 0.25-μm GaN Technology
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2024-11-12 DOI: 10.1109/TVLSI.2024.3489355
Hanjun Zhao;Xu Yan;Hui Chu;Xiaohua Zhu;Yongxin Guo
{"title":"A 360° Tunable Phase Shifter With Low Phase Error Based on Bandpass Networks in 0.25-μm GaN Technology","authors":"Hanjun Zhao;Xu Yan;Hui Chu;Xiaohua Zhu;Yongxin Guo","doi":"10.1109/TVLSI.2024.3489355","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3489355","url":null,"abstract":"This brief presents a 360° tunable phase shifter (PS) with low phase error in a 0.25-μm GaN-on-SiC HEMT process. To achieve these features, the design incorporates two key innovations: a novel switched-bandpass phase-shifting cell (PSC) topology and a Q-learning-based optimization algorithm, both applied for the first time in monolithic microwave integrated circuit (MMIC) PS designs. The adverse effects of the charge trapping effect in GaN HEMT switches are mitigated by using a nonlinear equivalent circuit model. A PS prototype consisting of a fifth-order bandpass PSC and two third-order bandpass PSCs with a core area of 1.25 × 2.5 mm² is designed, fabricated, and measured. 
Experimental results demonstrate a low rms phase error of less than 7.0°, along with high power linearity characterized by an IP1dB of 37 dBm and an IIP3 of 48 dBm, over a frequency range from 4.1 to 5.3 GHz.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 4","pages":"1172-1176"},"PeriodicalIF":2.8,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143667707","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Online Alignment and Addition in Multiterm Floating-Point Adders
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2024-11-11 DOI: 10.1109/TVLSI.2024.3488966
Kosmas Alexandridis;Giorgos Dimitrakopoulos
{"title":"Online Alignment and Addition in Multiterm Floating-Point Adders","authors":"Kosmas Alexandridis;Giorgos Dimitrakopoulos","doi":"10.1109/TVLSI.2024.3488966","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3488966","url":null,"abstract":"Multiterm floating-point (FP) addition appears in vector dot-product computations, matrix multiplications, and other forms of FP data aggregation. A critical step in multiterm floating-point addition is the alignment of fractions of the FP terms before adding them. Alignment is executed serially by identifying first the maximum of all exponents and then shifting the fraction of each term according to the difference of its exponent from the maximum one. Contrary to common practice, this work proposes a new online algorithm that splits the identification of the maximum exponent, the alignment shift for each fraction, and their addition to multiple fused incremental steps that can be computed in parallel. Each fused step is implemented by a new associative operator that allows the incremental alignment and addition for arbitrary number of operands. Experimental results show that employing the proposed align-and-add operators for the implementation of multiterm floating-point adders can improve delay or save significant area and power. 
The achieved area and power savings range between 3% and 23% and between 4% and 26%, respectively.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 4","pages":"1182-1186"},"PeriodicalIF":2.8,"publicationDate":"2024-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143667370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
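The contrast drawn in the abstract above, between serial alignment against a precomputed maximum exponent and an associative align-and-add step, can be sketched as follows. This is a minimal illustration and not the paper's operator. It models each term as a hypothetical (exponent, unsigned integer fraction) pair and ignores signs, rounding, and normalization. The names `serial_sum` and `align_add` are made up for the example. Right shifts truncate, so the different evaluation orders agree exactly only when no set bits are shifted out, as in the sample data.

```python
from functools import reduce


def serial_sum(terms):
    """Baseline: find the maximum exponent first, then shift and add every fraction."""
    e_max = max(e for e, _ in terms)
    return e_max, sum(f >> (e_max - e) for e, f in terms)


def align_add(a, b):
    """Merge two partial (exponent, sum) pairs, aligning both to the larger exponent."""
    (ea, fa), (eb, fb) = a, b
    e = max(ea, eb)
    return e, (fa >> (e - ea)) + (fb >> (e - eb))


# Fractions carry enough trailing zeros that no set bit is shifted out.
terms = [(3, 0b1010_0000), (5, 0b0110_0000), (4, 0b1100_0000), (2, 0b1000_0000)]

left_to_right = reduce(align_add, terms)
tree = align_add(align_add(terms[0], terms[1]), align_add(terms[2], terms[3]))
print(serial_sum(terms), left_to_right, tree)
```

Because the pairwise step does not need the global maximum exponent in advance, partial results can be combined in any grouping, for example as a balanced tree, which is the property that makes parallel evaluation possible in this simplified model.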
Citations: 0
Cost-Effective Analytical Models of Resistive Opens Defects in FinFET Technology
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2024-11-11 DOI: 10.1109/TVLSI.2024.3479068
Gustavo Aguirre;Freddy Forero;Victor Champac;Michel Renovell;Florence Azais;Mariane Comte;Jean-Marc Galliere
{"title":"Cost-Effective Analytical Models of Resistive Opens Defects in FinFET Technology","authors":"Gustavo Aguirre;Freddy Forero;Victor Champac;Michel Renovell;Florence Azais;Mariane Comte;Jean-Marc Galliere","doi":"10.1109/TVLSI.2024.3479068","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3479068","url":null,"abstract":"FinFET technology has become an attractive candidate for high-performance and power-efficient applications. However, its susceptibility to defects increases due to the complexity of the process fabrications and smaller feature sizes. This article proposes compact and low-cost analytical models to evaluate the delay increase in FinFET-based circuits due to resistive open defects. The models rely on electrical simulations to precharacterize the circuit library. Analytical expressions are developed for the three types of resistive opens that may occur in FinFET-based logic cells using multifin and multifinger structures. These types of resistive opens include: a resistive open at the drain or source of the transistors (RODS), a resistive open affecting the gate of a single transistor, and a resistive open affecting the gates of both nMOS and pMOS transistors. Compact analytical models are also developed to evaluate the delay increase due to the resistive open defects under process variations. Independent and correlated process variations are taken into account. The analytical models have been validated against SPICE electrical simulations. The proposed analytical models can be used to evaluate the detectability of resistive open defects, significantly reducing the cost of dealing with different defect sizes. Potential applications of the developed analytical models are delineated. 
This work allows us to have higher quality and reliable electronic products.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 3","pages":"841-852"},"PeriodicalIF":2.8,"publicationDate":"2024-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143489184","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
RCU-2^m: A VLSI Radix-2^m Cubic Unit
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2024-11-08 DOI: 10.1109/TVLSI.2024.3486237
Eduardo Antonio Ceśar da Costa;Morgana Macedo Azevedo da Rosa
{"title":"RCU-2^m: A VLSI Radix-2^m Cubic Unit","authors":"Eduardo Antonio Ceśar da Costa;Morgana Macedo Azevedo da Rosa","doi":"10.1109/TVLSI.2024.3486237","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3486237","url":null,"abstract":"Cubic operations are among the most used arithmetic operations in many applications that demand higher order simultaneous operand computation, such as cryptography and bicubic polynomial interpolation. This article proposes a novel VLSI radix-2^m cubic unit (RCU-2^m) capable of processing cubic operations at m bits simultaneously, with m values of 2 (RCU-4), 3 (RCU-8), and 4 (RCU-16). RCU-16 emerges as the most area-efficient configuration, surpassing RCU-8 and notably outperforming RCU-4. In the 8-bit scenario, RCU-16 achieves remarkable area savings, surpassing the literature’s proposed cubic unit by 11.58×. Across all configurations, RCU-2^m consistently outperforms the automatically selected cube unit, with energy savings ranging from 1.04× to 2×. In application specific integrated circuit (ASIC) and field-programmable gate array (FPGA)-based analyses, RCU-16 consistently exhibits superior performance in both area and energy savings compared with RCU-4, RCU-8, and solutions from the literature. 
These findings emphasize the importance of adopting radix-2^m configurations, particularly RCU-16, for optimal energy-constrained VLSI applications.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 3","pages":"733-745"},"PeriodicalIF":2.8,"publicationDate":"2024-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143489251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0