IEEE Transactions on Very Large Scale Integration (VLSI) Systems: Latest Articles

A Switched-Based Slew Rate and Gain Boosting Parallel-Path Amplifier for Switched-Capacitor Applications
IF 2.8, Zone 2 (Engineering and Technology)
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2025-04-22 DOI: 10.1109/TVLSI.2025.3557467
Javad Bagheri Asli;Alireza Saberkari;Atila Alvandpour
Abstract: A parallel-path amplifier (PPA) incorporating a switched-based slew rate and gain boosting stage as a feed-forward path, in parallel with a linear amplifier is introduced in this brief as an alternative to conventional analog amplifiers to achieve a high accuracy through the linear path and high slewing through the assisted feed-forward path. The feed-forward path employs a pre-amplifier, hysteresis-detector, and differential charge pumps, while the linear path includes a recycling folded-cascode amplifier. An analysis is performed to study the amplifier’s settling error with and without the feed-forward path, and also the trade-off between the dead-zone width of the hysteresis detector and the amplifier’s settling speed. The assisted feed-forward path has improved the slew rate ×2.5–800 V/µs, effective GBW by 15%, and dc gain by 16 dB at the expense of adding 187.5 µA extra current consumption and 1.25 µm² extra silicon area. To prove the concept, the proposed amplifier is used as a multiplying digital-to-analog converter (MDAC) amplifier of an 8-bit pipeline analog-to-digital converter (ADC), and the ADC is fabricated in a 65-nm CMOS process. The results reveal that the spurious free dynamic range (SFDR) and signal-to-noise and distortion ratio (SNDR) performances are improved by 6–7 dB in the presence of the feed-forward path.
Volume 33, Issue 6, pp. 1799-1802. Journal Article.
Citations: 0
Enhancing Wireless PHY With Adaptive OFDM and Multiarmed Bandit Learning on Zynq System-on-Chip
IF 2.8, Zone 2 (Engineering and Technology)
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2025-04-14 DOI: 10.1109/TVLSI.2025.3528865
Neelam Singh;Sumit J. Darak
Abstract: In this work, we present an intelligent and reconfigurable wireless physical layer (PHY) that dynamically adjusts the transmission parameters for a given radio frequency (RF) environment. The proposed PHY is based on orthogonal frequency division multiplexing (OFDM) and can dynamically augment OFDM with a finite impulse response (FIR) low-pass filter to improve the out-of-band emissions (OOBE). To make these adaptations intelligently, we employ multiarmed bandit (MAB)-based online learning algorithms, specifically upper confidence bound with control variate (UCB-CV). UCB-CV enhances traditional UCB by incorporating additional information such as interference level and transmit power, allowing it to manage interference more effectively. These algorithms are integrated into the PHY of an FPGA-based OFDM transceiver on the Zynq system-on-chip (SoC), facilitating real-time decision-making based on side-channel interference and other parameters. Our comparative analysis highlights the enhanced performance of the UCB-CV algorithm over the traditional UCB in terms of reducing the bit-error rate (BER) and managing interference more effectively. Unlike the traditional UCB, UCB-CV leverages side information through a control variate approach, incorporating the coefficient of variation (CV) into reward estimation to better handle interference. Additionally, we underline the advantages of filtered-OFDM (FOFDM) compared to standard OFDM. Notably, FOFDM significantly reduces OOBE by 20–75 dBW/Hz and improves BER. In environments with high interference, UCB-CV achieves a throughput improvement of 29.54% compared to UCB.
Volume 33, Issue 6, pp. 1651-1664. Journal Article.
Citations: 0
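The abstract above compares UCB-CV against the traditional UCB bandit algorithm. As a rough illustration of that baseline only, the Python sketch below implements textbook UCB1. It is not the authors' UCB-CV: it uses no control variates and no side information such as interference level or transmit power. The function name `ucb1`, the `pull` callback, and the idea that each arm stands for one candidate transmission configuration with a reward in [0, 1] are assumptions made for illustration.

```python
import math


def ucb1(pull, n_arms, horizon):
    """Run textbook UCB1 for `horizon` rounds.

    pull(arm) is a hypothetical callback that returns a reward in [0, 1],
    for example 1.0 when a frame is received without errors.
    Returns the per-arm pull counts and empirical mean rewards.
    """
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Initialization: play every arm once.
            arm = t - 1
        else:
            # Pick the arm with the largest upper confidence bound.
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        # Incremental update of the empirical mean.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


if __name__ == "__main__":
    # Hypothetical fixed mean rewards for three configurations.
    mean_reward = [0.2, 0.5, 0.9]
    counts, means = ucb1(lambda a: mean_reward[a], n_arms=3, horizon=1000)
    print(counts, means)
```

The exploration bonus shrinks as an arm is sampled more often, so the learner concentrates on the configuration with the best observed reward while still occasionally re-checking the others.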
Implementing Homomorphic Encryption-Based Logic Locking in System-On-Chip Designs
IF 2.8, Zone 2 (Engineering and Technology)
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2025-04-10 DOI: 10.1109/TVLSI.2025.3556241
Ziyang Ye;Makoto Ikeda
Abstract: This study presents a logic-locking scheme based on the binary ring learning with error (bin-RLWE) algorithm, implemented in a reduced instruction set computer-five (RISC-V) system-on-chip (SoC) design. Unlike traditional logic-locking methods that require providing users with raw locking parameters, the proposed approach secures critical logic paths in the privilege switching process without exposing these sensitive parameters. The implemented locking module itself consumes 3519 lookup tables (LUTs) and 2645 registers, leading to an overall overhead of 6.0% in LUTs and 6.9% in registers compared to the baseline system. The unlock process requires about 2.6 µs, introducing moderate performance impact and primarily affecting system-level operations while preserving user-level computational efficiency.
Volume 33, Issue 7, pp. 2049-2053. Journal Article.
Citations: 0
Real-Time Driver Monitoring: Implementing FPGA-Accelerated CNNs for Pose Detection
IF 2.8, Zone 2 (Engineering and Technology)
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2025-04-10 DOI: 10.1109/TVLSI.2025.3554880
Minjoon Kim;Jaehyuk So
Abstract: As autonomous driving technology advances at an unprecedented pace, drivers are experiencing greater freedom within their vehicles, which accelerates the development of various intelligent systems to support safe and more efficient driving. These intelligent systems provide interactive applications between the vehicle and the driver, utilizing driver behavior analysis (DBA). A key performance indicator is real-time driver monitoring quality, as it directly impacts both safety and convenience in vehicle operation. In order to achieve real-time interaction, an image processing speed exceeding 30 frames/s and a delay time (latency) below 100 ms are generally required. However, expensive devices are often necessary to support this with software. Therefore, this article presents an algorithm and implementation results for immediate in-vehicle DBA through field-programmable gate array (FPGA)-based high-speed upper body-pose estimation. First, we define the 11 key points related to the driver’s pose and gaze and model a convolutional neural network (CNN) architecture that can quickly detect them. The proposed algorithm utilizes regeneration and retraining through layer reduction based on the residual-CNN model. In addition, the algorithm presents the results of its implementation at the register transfer level (RTL) level of the VCU118 FPGA and demonstrates simulation results of 34.7 frames/s and a delay time of 75.3 ms. Lastly, we discuss the results of linking a demo application and creating a vehicle testbed to experiment with the driver–vehicle interaction (DVI) system. A developed FPGA platform is implemented to process camera image input in real time. It reliably supports detected pose and gaze results at 30 frames/s via Ethernet. It also presents results that verify its application in screen control and driver monitoring systems.
Volume 33, Issue 7, pp. 1848-1857. Journal Article.
Citations: 0
RISC-V CPU Design Using RRAM-CMOS Standard Cells
IF 3.1, Zone 2 (Engineering and Technology)
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2025-04-10 DOI: 10.1109/TVLSI.2025.3554476
Markus Fritscher;Max Uhlmann;Philip Ostrovskyy;Daniel Reiser;Junchao Chen;Jianan Wen;Carsten Schulze;Gerhard Kahmen;Dietmar Fey;Marc Reichenbach;Milos Krstic;Christian Wenger
Abstract: The breakdown of Dennard scaling has been the driver for many innovations such as multicore CPUs and has fueled the research into novel devices such as resistive random access memory (RRAM). These devices might be a means to extend the scalability of integrated circuits since they allow for fast and nonvolatile operation. Unfortunately, large analog circuits need to be designed and integrated in order to benefit from these cells, hindering the implementation of large systems. This work elaborates on a novel solution, namely, creating digital standard cells utilizing RRAM devices. Albeit this approach can be used both for small gates and large macroblocks, we illustrate it for a 2T2R-cell. Since RRAM devices can be vertically stacked with transistors, this enables us to construct a NAND standard cell, which merely consumes the area of two transistors. This leads to a 25% area reduction compared to an equivalent CMOS NAND gate. We illustrate achievable area savings with a half-adder circuit and integrate this novel cell into a digital standard cell library. A synthesized RISC-V core using RRAM-based cells results in a 10.7% smaller area than the equivalent design using standard CMOS gates.
Volume 33, Issue 9, pp. 2406-2414. Journal Article. Open access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10960690
Citations: 0
Xpikeformer: Hybrid Analog-Digital Hardware Acceleration for Spiking Transformers
IF 2.8, Zone 2 (Engineering and Technology)
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2025-04-08 DOI: 10.1109/TVLSI.2025.3552534
Zihang Song;Prabodh Katti;Osvaldo Simeone;Bipin Rajendran
Abstract: The integration of neuromorphic computing and transformers through spiking neural networks (SNNs) offers a promising path to energy-efficient sequence modeling, with the potential to overcome the energy-intensive nature of the artificial neural network (ANN)-based transformers. However, the algorithmic efficiency of SNN-based transformers cannot be fully exploited on GPUs due to architectural incompatibility. This article introduces Xpikeformer, a hybrid analog-digital hardware architecture designed to accelerate SNN-based transformer models. The architecture integrates analog in-memory computing (AIMC) for feedforward and fully connected layers, and a stochastic spiking attention (SSA) engine for efficient attention mechanisms. We detail the design, implementation, and evaluation of Xpikeformer, demonstrating significant improvements in energy consumption and computational efficiency. Through image classification tasks and wireless communication symbol detection tasks, we show that Xpikeformer can achieve inference accuracy comparable to the GPU implementation of ANN-based transformers. Evaluations reveal that Xpikeformer achieves a 13× reduction in energy consumption at approximately the same throughput as the state-of-the-art (SOTA) digital accelerator for ANN-based transformers. In addition, Xpikeformer achieves up to 1.9× energy reduction compared to the optimal digital ASIC projection of SOTA SNN-based transformers.
Volume 33, Issue 6, pp. 1596-1609. Journal Article.
Citations: 0
A 28-nm Cascode Current Mirror-Based Inconsistency-Free Charging-and-Discharging SRAM-CIM Macro for High-Efficient Convolutional Neural Networks
IF 2.8, Zone 2 (Engineering and Technology)
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2025-04-07 DOI: 10.1109/TVLSI.2025.3552641
Chunyu Peng;Jiating Guo;Shengyuan Yan;Yiming Wei;Xiaohang Chen;Wenjuan Lu;Chenghu Dai;Zhiting Lin;Xiulong Wu
Abstract: Computing-in-memory (CIM) is an emerging approach to alleviate the von Neumann bottleneck and enhance energy efficiency and throughput. This brief introduces a 16-Kb static random access memory (SRAM) CIM macro for convolutional neural networks (CNNs), featuring a cascode current mirror-based inconsistency-free computing circuits (CICCs). The bias voltage of CICC is provided by a cascode current mirror (CCM) circuit. The proposed architecture improves the consistency and linearity of bitline (BL) charge and discharge rates in the analog current domain, enhancing computational accuracy. Additionally, the charge and discharge on the BLs represent the positive or negative calculation result, eliminating the need for extra encoding and logic circuits to handle sign bits. The SRAM-CIM macro achieves an energy efficiency of 59.1–134.0 TOPS/W and a throughput of 0.41 TOPS in a 28-nm CMOS technology, and the estimated inference accuracy on MNIST and CIFAR-10 datasets is 96.5% and 91.4%, respectively, with 5-bit input precision and 1-bit weight precision.
Volume 33, Issue 7, pp. 2044-2048. Journal Article.
Citations: 0
Flex-PE: Flexible and SIMD Multiprecision Processing Element for AI Workloads
IF 2.8, Zone 2 (Engineering and Technology)
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2025-04-07 DOI: 10.1109/TVLSI.2025.3553069
Mukul Lokhande;Gopal Raut;Santosh Kumar Vishvakarma
Abstract: The rapid evolution of artificial intelligence (AI) models, from deep neural networks (DNNs) to transformers/large-language models (LLMs), demands flexible hardware solutions to meet diverse execution needs across edge and cloud platforms. Existing accelerators lack unified support for multiprecision arithmetic and runtime-configurable activation functions (AFs). This work proposes Flex-PE, a single instruction, multiple data (SIMD)-enabled multiprecision processing element that efficiently integrates multiply-and-accumulate operations with configurable AFs using unified hardware, including Sigmoid, Tanh, ReLU, and SoftMax. The proposed design achieves throughput improvements of up to 16× FxP4, 8× FxP8, 4× FxP16, and 1× FxP32, with maximum hardware efficiency for both iterative and pipelined architectures. An area-efficient iterative Flex-PE-based SIMD systolic array reduces DMA reads by up to 62× and 371× for input feature maps and weight filters in VGG-16, achieving 8.42 GOPS/W energy efficiency with minimal accuracy loss (<2%). Flex-PE scales from 4-bit edge inference to FxP8/16/32, supporting edge and cloud high-performance computing (HPC) while providing high-performance adaptable AI hardware with optimal precision, throughput, and energy efficiency.
Volume 33, Issue 6, pp. 1610-1623. Journal Article.
Citations: 0
Cost-Optimized Double-Node-Upset-Recovery Latch Designs With Aging Mitigation and Algorithm-Based Verification for Long-Term Robustness Enhancement
IF 2.8, Zone 2 (Engineering and Technology)
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2025-04-03 DOI: 10.1109/TVLSI.2025.3554117
Aibin Yan;Changli Hu;Jing Li;Na Bai;Zhengfeng Huang;Tianming Ni;Girard Patrick;Xiaoqing Wen
Abstract: With the continuous advancement of CMOS technologies, soft errors, such as single-node upset (SNU) and double-node upset (DNU), caused by radiation in nanoscale integrated circuits, are becoming increasingly prominent. Meanwhile, transistor aging mitigation is indispensable for long-term robustness enhancement. First, to reduce the impact of radiation on circuits, we propose a novel DNU-recovery latch with low cost, namely, DURLC, only consisting of four dual-input C-elements (CEs) and four clock-gated input-split inverters for the storage of values. Second, we propose a DNU-recovery latch with moderate cost, namely, DURMC, based on seven CEs and four inverters, for convenience to optimize the latch to alleviate aging. The proposed DNU-recovery latch with mitigated aging is called DURMA. The latch employs a high-speed path to reduce delay without sacrificing performance when mitigating aging issues. Finally, we propose an algorithm-based verification method to validate the DNU recovery of the proposed latches. The simulation results show that, compared with the state-of-the-art robust latches, the proposed latches have the advantages of DNU recovery with moderate and even low cost, and meanwhile, aging is effectively mitigated for the DURMA latch.
Volume 33, Issue 6, pp. 1765-1773. Journal Article.
Citations: 0
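The latches in the abstract above are built from dual-input C-elements (CEs). As background only, the Python sketch below models the behavior of a generic, non-inverting Muller C-element: the output follows the inputs when they agree and holds its previous value when they disagree. This is an assumption-laden behavioral model, not the DURLC, DURMC, or DURMA circuits; hardened latches frequently use inverting CEs, and the class name `CElement` is hypothetical.

```python
class CElement:
    """Behavioral model of a two-input, non-inverting Muller C-element."""

    def __init__(self, state=0):
        # The held output value (0 or 1).
        self.state = state

    def update(self, a, b):
        """Apply inputs a and b (each 0 or 1) and return the output."""
        if a == b:
            # Inputs agree: the output follows them.
            self.state = a
        # Inputs disagree: the output keeps its previous value.
        return self.state


if __name__ == "__main__":
    ce = CElement(state=1)
    print(ce.update(1, 1))  # inputs agree on 1
    print(ce.update(0, 1))  # one input upset: output is held
    print(ce.update(0, 0))  # inputs agree on 0
```

The hold behavior is why CEs are common in soft-error-tolerant storage: a transient flip on a single input does not propagate to the output as long as the other input stays correct.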
Hybrid Number Theoretic Transform Architecture for Homomorphic Encryption
IF 2.8, Zone 2 (Engineering and Technology)
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2025-04-03 DOI: 10.1109/TVLSI.2025.3552852
Quang Dang Truong;Phap Duong-Ngoc;Hanho Lee
Abstract: Fully homomorphic encryption (FHE) is an innovative cryptographic technology that has the potential to protect the privacy and confidentiality of data in the untrusted environments, such as public clouds or external parties. However, due to the inclusion of time-consuming polynomial arithmetic, FHE remains a challenge for computationally heavy applications. The number theoretic transform (NTT) is widely used in HE to reduce the complexity of polynomial multiplication. Therefore, implementing NTT in hardware for FHE has been explored in prior studies. However, due to the high hardware resource requirements, especially with a large number of moduli, hardware architecture supporting both NTT and its inverse transform (INTT) is still missing. This brief presents a hardware architecture for 2^17 NTT and INTT suitable for high-circuit depth CKKS-based HE schemes, satisfying both criteria of high speed and affordability for various FPGA platforms. The implementation results highlight that this design is area-efficient compared to the most related work and hardware-friendly for practical HE-based applications on FPGA devices.
Volume 33, Issue 7, pp. 2039-2043. Journal Article.
Citations: 0
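The abstract above notes that the NTT reduces the complexity of polynomial multiplication. A toy Python sketch of that idea follows, assuming a tiny prime modulus 17, transform length 8, and the primitive 8th root of unity 2 (since 2^4 = 16 = -1 mod 17). It computes a cyclic product; CKKS-style schemes typically use a negacyclic variant with far larger parameters, and nothing here reflects the hardware architecture of the paper. The function names are hypothetical.

```python
def ntt(a, root, mod):
    """Recursive radix-2 NTT; len(a) must be a power of two and
    `root` a primitive len(a)-th root of unity modulo `mod`."""
    n = len(a)
    if n == 1:
        return list(a)
    root_sq = root * root % mod
    even = ntt(a[0::2], root_sq, mod)
    odd = ntt(a[1::2], root_sq, mod)
    out = [0] * n
    w = 1
    for k in range(n // 2):
        # Butterfly: combine the two half-size transforms.
        t = w * odd[k] % mod
        out[k] = (even[k] + t) % mod
        out[k + n // 2] = (even[k] - t) % mod
        w = w * root % mod
    return out


def intt(a, root, mod):
    """Inverse NTT: forward transform with root^-1, then scale by n^-1."""
    n = len(a)
    inv_n = pow(n, -1, mod)        # modular inverse (Python 3.8+)
    inv_root = pow(root, -1, mod)
    return [x * inv_n % mod for x in ntt(a, inv_root, mod)]


def poly_mul_cyclic(a, b, root, mod):
    """Multiply two polynomials modulo (x^n - 1, mod) via pointwise products."""
    fa = ntt(a, root, mod)
    fb = ntt(b, root, mod)
    return intt([x * y % mod for x, y in zip(fa, fb)], root, mod)


if __name__ == "__main__":
    MOD, ROOT = 17, 2
    a = [1, 2, 3, 0, 0, 0, 0, 0]   # 1 + 2x + 3x^2
    b = [4, 5, 0, 0, 0, 0, 0, 0]   # 4 + 5x
    print(poly_mul_cyclic(a, b, ROOT, MOD))
```

Two forward transforms, one pointwise multiplication, and one inverse transform replace the quadratic schoolbook product, which is the saving that motivates dedicated NTT/INTT hardware.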