IEEE Transactions on Very Large Scale Integration (VLSI) Systems: Latest Articles

A Two-Stage CMOS Amplifier With High Degree of Stability for All Capacitive Loads
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2025-02-05 DOI: 10.1109/TVLSI.2025.3532362
Germano Nicollini;Alessandro Bertolini
Abstract: This article presents the conception, design, and realization of a fully differential two-stage CMOS amplifier, that is, unconditionally stable for any value of the capacitive load. This is simply achieved by sending a scaled replica of the output stage current to the amplifier virtual ground in order to create a left half-plane (LHP) zero in the loop gain that either cancels or tracks the output pole in all process, voltage, and temperature (PVT) conditions. Consequently, from a stability point of view, the amplifier behavior resembles that of a single-pole OTA. Starting from an existing two-stage gain-programmable amplifier, designed in a 0.18-$\mu$m bipolar-CMOS-DMOS (BCD) process that was able to drive only 10 pF without encountering into stability issues, a simple circuit has been added to extend the stability to any capacitive load value. An interesting and unusual method, based on the frequency behavior of the unloaded closed-loop amplifier output impedance, has been introduced to further verify the unconditional stability of this solution. Measurements show a high degree of stability in any load conditions. In the used 0.18-$\mu$m BCD technology, silicon area and current consumption of the extra circuit are only 0.0004 mm² and $2~\mu$A, respectively, with a 5-V power supply.
Vol. 33, No. 5, pp. 1235-1243
Citations: 0
A Model Splitting Approach to Improve Reliability and Accuracy for Alternate Test of Analog/Mixed-Signal Circuits
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2025-02-05 DOI: 10.1109/TVLSI.2025.3530956
Jiaming Zhao;Naixin Zhou;Shibo Chen;Yijiu Zhao;Guibing Zhu
Abstract: Machine learning-based alternate test of analog/mixed-signal integrated circuits (ICs) has been widely studied in the last decade, which has the benefits of simplifying test equipment and decreasing test costs. However, due to low reliability and accuracy, it is hard to adopt the alternate test technique in the industry. In this article, a model splitting approach (MDSP approach) is proposed to improve the reliability and accuracy of the alternate test. The machine learning-based estimation model is “split” into two models with “complementary” performance (a “positive” model and a “negative” model). The “positive” model generates estimations that are no smaller than label values, while the “negative” model outputs estimations that are no larger than label values. Estimations with excessive differences between two models are identified as suspected estimations with large errors and filtered out. The rest results of “complementary” models are averaged to generate the final estimations. By comparing estimations of two models, the estimations with large error are filtered out effectively, and the estimation accuracy is improved significantly by fusing the results of two estimators. The MDSP approach is investigated with data from the commercial analog-to-digital converter and operational amplifier (OP). Results demonstrated that the proposed approach can improve test reliability and accuracy significantly.
Vol. 33, No. 5, pp. 1224-1234
Citations: 0
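The filter-and-average step described in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `fuse_estimates`, the threshold parameter `max_gap`, and the sample values are hypothetical and do not come from the paper, and the training of the "positive" and "negative" models is not shown.

```python
def fuse_estimates(pos, neg, max_gap):
    """Fuse two complementary estimators (hypothetical helper).

    pos[i] is the estimate from the "positive" model (expected >= label),
    neg[i] is the estimate from the "negative" model (expected <= label).
    Estimates whose gap exceeds max_gap are treated as suspect and dropped;
    the remaining pairs are averaged.
    Returns a list of (sample_index, fused_estimate) tuples.
    """
    fused = []
    for i, (p, n) in enumerate(zip(pos, neg)):
        if abs(p - n) > max_gap:
            continue  # suspected large error: filter this sample out
        fused.append((i, (p + n) / 2))
    return fused


# Made-up sample data: the third pair disagrees strongly and is filtered.
pos = [1.25, 2.5, 4.0]
neg = [0.75, 2.0, 2.0]
result = fuse_estimates(pos, neg, max_gap=0.75)
print(result)  # [(0, 1.0), (1, 2.25)]
```

Filtered samples would presumably be sent to a conventional test in practice; that routing is an assumption here and is left out of the sketch.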
A Flexible DA-Based Architecture for Computation of Inner Product of Variable Vectors
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2025-02-04 DOI: 10.1109/TVLSI.2025.3528244
Anil Kali;Samrat L. Sabat;Pramod Kumar Meher
Abstract: The computation of inner products of any given pair of vectors is an indispensable requirement in several applications including artificial intelligence (AI), machine learning (ML), signal processing, image processing, communication, and many others. The throughput requirement of inner product computation varies widely for different applications. Moreover, the throughput of computation must match the requirements of the applications. It is therefore important to design flexible hardware for inner product computation that produces the desired throughput. Distributed arithmetic (DA) is a well-known approach for efficient inner product computation. This article presents an efficient DA-based architecture for computing the inner product of variable vectors, which could be tailored according to the throughput requirement of any given application and reused for different inner product lengths. The proposed designs could also be deployed to achieve a trade-off between throughput and area/energy consumption. In this article, we have used modified Booth encoding (MBE) to reduce the number of partial products and proposed a novel carry-save accumulator (CSA) for shortening the critical path delay. The proposed designs are synthesized by Cadence Genus using GPDK 90-nm technology library and place-and-route using Cadence Innovus for different inner product lengths and word lengths. As found from the postlayout synthesis results, the proposed designs offer savings of nearly 30% and 29% EPC and ADP over the bit-serial DA-based design on average for word lengths 8 and 16 and inner product lengths 8, 16, and 32, respectively.
Vol. 33, No. 4, pp. 953-962
Citations: 0
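For readers unfamiliar with distributed arithmetic, the textbook bit-serial formulation that the abstract above uses as its baseline can be modeled in software as follows. This is a behavioral sketch only: it does not implement the paper's modified Booth encoding or carry-save accumulator, and the function name `da_inner_product` is a hypothetical choice. Because both vectors are variable, the per-bit partial sum is formed by adding the selected elements of `a` rather than by reading a precomputed lookup table.

```python
def da_inner_product(a, x, bits):
    """Bit-serial distributed-arithmetic model of sum(a[k] * x[k]).

    Each x[k] is interpreted as a `bits`-wide two's-complement integer.
    For every bit position b, the elements a[k] whose x[k] has bit b set
    are summed, and that partial sum is weighted by 2**b. The sign bit
    (b == bits - 1) carries a negative weight.
    """
    mask = (1 << bits) - 1
    acc = 0
    for b in range(bits):
        # Partial sum of the a[k] selected by bit b of each x[k].
        partial = sum(ak for ak, xk in zip(a, x) if ((xk & mask) >> b) & 1)
        if b == bits - 1:
            acc -= partial << b  # two's-complement sign bit
        else:
            acc += partial << b
    return acc


a = [3, -2, 5, 7]
x = [-4, 6, -1, 2]
print(da_inner_product(a, x, bits=4))  # -15, same as sum(ak * xk)
```

The loop runs once per bit of `x`, which is why the word length directly sets the latency of a bit-serial design and why processing several bits per cycle trades area for throughput.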
A Capacitorless Flipped-Voltage-Follower-Based Low-Dropout Regulator Incorporating Adaptive-Compensation Buffer
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2025-02-04 DOI: 10.1109/TVLSI.2025.3535630
Yee-Chyan Tan;Harikrishnan Ramiah;S. F. Wan Muhamad Hatta;Chee-Cheow Lim;Rui P. Martins;Pui-In Mak;Yong Chen
Abstract: This brief presents an output-capacitorless low-dropout (OCL-LDO) regulator based on flipped-voltage-follower (FVF) and dual pMOS pass transistors. An adaptive-compensation buffer (ACB) dynamically regulates the operation of the pass transistors. Specifically, when the load current falls below 5 mA, only the smaller pass transistor is activated; otherwise, both pass transistors are engaged, thereby simultaneously mitigating the minimum load current requirement for FVF architecture and extending the load current ranging from 0 to 30 mA while maintaining stability without an external load capacitor. At 1.15-V supply voltage and 0-mA load current, the quiescent current is $6~\mu$A. The output voltage is 1.0 V with a dropout voltage of 0.15 V. Measurements show that with a load current stepping from 0 to 30 mA at an edge time of 100 ns, the output voltage undershoot is 0.2 V with a recovery time of 200 ns while achieving a load regulation of 0.23 mV/V. Our OCL-LDO is fabricated in a 180-nm CMOS with an active area of 0.031 mm².
Vol. 33, No. 5, pp. 1422-1426
Citations: 0
Virtual_N2_PDK: A Predictive Process Design Kit for 2-nm Nanosheet FET Technology
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2025-01-28 DOI: 10.1109/TVLSI.2025.3529504
Yiying Liu;Minghui Yin;Huanhuan Zhou;Yunxia You;Weihua Zhang;Hongwei Liu;Chen Wang;Yajie Zou;Zhiqiang Li
Abstract: Nanosheet FETs (NSFETs) are considered promising candidates to replace FinFETs as the dominant devices in sub-5-nm processes. To encourage further research into NSFET-based integrated circuits, we present Virtual_N2_PDK, a predictive process design kit (PDK) for 2-nm NSFET technology. All assumptions are based on publicly available sources. Ruthenium (Ru) interconnects are employed for the buried power rail (BPR) and tight-pitch layers. Wrap-around contact (WAC) is also integrated into Virtual_N2_PDK to investigate its impact on circuit performance. By calibrating the BSIM-CMG model with 3-D technology computer-aided design (TCAD) electrothermal simulation results, SPICE models that account for self-heating effects (SHEs) are generated for devices with and without WAC. The simulation results show that with the WAC structure, the energy-delay product (EDP) of standard cells is reduced by an average of 25.18%, while the frequency of a 15-stage ring oscillator circuit increases by 26.05%.
Vol. 33, No. 4, pp. 1004-1013
Citations: 0
FPGA Implementation of Staged Projection Refining Multiple Orthogonal Matching Pursuit Algorithm for Compressed Sensing
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2025-01-28 DOI: 10.1109/TVLSI.2025.3529954
Sujuan Liu;Yichen Liang;Zixing Zhang;Peiyuan Wan
Abstract: Reconstruction algorithms are an integral part of compressed sensing (CS) theory, which can reliably reconstruct the original signal from the low-dimensional compressed signal. The orthogonal matching pursuit (OMP) algorithm has been widely studied and extensively selected in hardware implementations. However, the low reconstruction success rate of the OMP algorithm under high sparsity conditions has led to the proposal and application of more reconstruction algorithms in hardware implementations. In this article, a staged projection refining multiple OMP (SPR-MOMP) algorithm is proposed based on the OMP algorithm. This algorithm improves the reconstruction accuracy by refining the support set using a staged backtracking strategy. It also employs a multiple-atom selection strategy for parallel expansion of the support set, ensuring reconstruction efficiency. The reconstruction simulation demonstrates that the SPR-MOMP algorithm achieves a higher reconstruction success rate than the OMP algorithm, with fewer iterations. A hardware architecture applying the SPR-MOMP algorithm is designed and implemented on a Virtex UltraScale+ field-programmable gate array (FPGA) with $N = 1024$, $M = 256$, and $K = 36$. The proposed architecture achieves a reconstruction signal-to-noise ratio (RSNR) of 44.27 dB, with 20-bit data width and 15-bit fractional width. The maximum clock frequency of the architecture is 200 MHz, enabling reconstruction within $276.6~\mu$s. The proposed architecture achieves a lower dynamic power consumption of 1929 mW.
Vol. 33, No. 5, pp. 1334-1347
Citations: 0
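As background for the abstract above, the baseline OMP algorithm that SPR-MOMP builds on can be sketched in plain Python. This is the standard single-atom OMP only; the staged backtracking and multiple-atom selection of SPR-MOMP are not reproduced. The helper names (`dot`, `solve`, `omp`) and the tiny 3×4 sensing matrix are illustrative assumptions, and the least-squares step uses the normal equations for brevity rather than a numerically robust factorization.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve(A, b):
    """Solve the square system A z = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (M[r][n] - tail) / M[r][r]
    return z


def omp(Phi, y, sparsity):
    """Recover a sparse x with y ~= Phi x, selecting one atom per iteration."""
    m, n = len(Phi), len(Phi[0])
    cols = [[Phi[i][j] for i in range(m)] for j in range(n)]
    support, coef, residual = [], [], list(y)
    for _ in range(sparsity):
        # 1. Pick the unused column most correlated with the residual.
        candidates = [j for j in range(n) if j not in support]
        best = max(candidates, key=lambda j: abs(dot(cols[j], residual)))
        support.append(best)
        # 2. Least-squares fit of y on the selected columns (normal equations).
        gram = [[dot(cols[p], cols[q]) for q in support] for p in support]
        rhs = [dot(cols[p], y) for p in support]
        coef = solve(gram, rhs)
        # 3. Update the residual.
        residual = [
            y[i] - sum(c * cols[s][i] for c, s in zip(coef, support))
            for i in range(m)
        ]
    x = [0.0] * n
    for c, s in zip(coef, support):
        x[s] = c
    return x


# Made-up example: a 2-sparse signal observed through a 3x4 matrix.
Phi = [[1.0, 0.0, 0.0, 0.5],
       [0.0, 1.0, 0.0, 0.5],
       [0.0, 0.0, 1.0, 0.5]]
y = [3.0, 0.0, -1.0]
x_hat = omp(Phi, y, sparsity=2)
```

In this example the first iteration selects column 0 and the second selects column 2, so `x_hat` should match the sparse vector `[3, 0, -1, 0]` up to floating-point rounding.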
Reconfigurable 10T SRAM for Energy-Efficient CAM Operation and In-Memory Computing
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2025-01-27 DOI: 10.1109/TVLSI.2025.3526973
Zhang Zhang;Zhihao Chen;Jiedong Wang;Guangjun Xie;Gang Liu
Abstract: The limitations of the von Neumann architecture in terms of power consumption and throughput are increasingly evident. In-memory computing is a promising computing paradigm to alleviate this limitation. This article proposes a high-speed and low-power 10T compute-static random-access memory (CSRAM) capable of conducting rowwise search operations and executing in-memory logic functions efficiently. A self-suppressed discharge scheme is implemented to curtail the power consumption of the search operation by reducing the discharge swing of the match lines (MLs). The rowwise search scheme avoids vertical data storage, enhancing the compatibility between different operation modes. The proposed 10T SRAM architecture addresses the issue of sneak currents effectively when multiple lines are activated. Additionally, decoupled read ports eliminate compute access disturbance. To validate the design, a 4Kb array is designed with a 40-nm CMOS technology. At a supply voltage (VDD) of 1.1 V, the in-memory logic operations are capable of operating at a frequency of 752 MHz, consuming 29.2 fJ/bit. In binary content-addressable memory (BCAM) search mode, the minimum energy consumption of 0.51 fJ/bit occurs at 0.8 V and 120 MHz.
Vol. 33, No. 4, pp. 1065-1072
Citations: 0
An Area-Efficient VLSI Architecture for High-Throughput Computation of the 2-D DWT
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2025-01-27 DOI: 10.1109/TVLSI.2025.3529690
Yuzhou Dai;Wei Zhang;Lin Shi;Qitao Li;Zhuolun Wu;Yanyan Liu
Abstract: In this article, an area-efficient VLSI architecture scheme for high-throughput computation of the 2-D discrete wavelet transform (DWT) is proposed, effectively applied in the context of aircraft cargo hold scenes. The proposed architecture aims to reduce computation and storage resources while maintaining the DWT-IDWT reconstructed image quality for the 9/7 discrete wavelet. The hardware implementation formulae based on the flipping architecture have been modified to reduce RAM storage bit width. By transforming the coefficients of the formula into hardware-friendly values, the required multiplication operations are split into two stages of addition. On this basis, a pipelined architecture is constructed to set the critical path delay (CPD) of the architecture to be close to the delay of a single adder, $T_{a}$, thereby achieving a high throughput. Compared to existing architectures in the research field, the proposed single-level 2-D DWT architecture achieves resource savings on the field-programmable gate array (FPGA) platform while ensuring good image reconstruction quality. The advantages of the multilevel 2-D DWT are even more pronounced. In the simulation results on the application-specific integrated circuit (ASIC) platform, the proposed architecture reduces computation time by at least 35.54% while achieving a higher level of decomposition, decreases the area-delay product (ADP) by at least 25.41%, and saves a significant amount of energy per image (EPI). Furthermore, the proposed folded architecture achieves close to 100% hardware utilization efficiency (HUE) in multilevel 2-D DWT computations.
Vol. 33, No. 5, pp. 1292-1303
Citations: 0
Corrections to “GNN-Based Hardware Trojan Detection at Register Transfer Level Leveraging Multiple-Category Features”
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2025-01-24 DOI: 10.1109/TVLSI.2025.3525903
Peijun Ma;Ge Shang;Hongjin Liu;Jiangyi Shi;Weitao Pan;Yan Zhang;Yue Hao
Abstract: Presents corrections to the paper, (Corrections to “GNN-Based Hardware Trojan Detection at Register Transfer Level Leveraging Multiple-Category Features”).
Vol. 33, No. 3, p. 902
Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10852349
Citations: 0
Improved Step-GRAND: Low-Latency Soft-Input Guessing Random Additive Noise Decoding
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2025-01-23 DOI: 10.1109/TVLSI.2025.3529637
Syed Mohsin Abbas;Marwan Jalaleddine;Chi-Ying Tsui;Warren J. Gross
Abstract: The ultrareliable low-latency communication (URLLC) application scenario requires the adoption of short linear block codes to satisfy the low-latency requirements. Guessing random additive noise decoding (GRAND) is a prominent universal decoding solution for short linear block codes that lends itself to efficient hardware implementations. GRAND-based hardware implementations generally offer reduced average decoding latency but their high worst-case (W.C.) latency renders them unsuitable for deployment in mission-critical applications. This article presents an improved version of step-GRAND, a soft-input variant of GRAND that features a novel test error pattern (TEP) generating approach. A novel very large-scale integration (VLSI) architecture is developed for the execution of the improved step-GRAND algorithm with reduced W.C. decoding latency. Application specific integrated circuit (ASIC) implementation results, employing low-power (LP) TSMC 65-nm CMOS technology, demonstrate that the proposed improved step-GRAND can achieve an average decoding latency as low as 10 ns for decoding a $(128,105)$ linear block code at a target frame error rate (FER) of $10^{-7}$, while the W.C. decoding latency can reach $300~\text{ns} \sim 1~\mu\text{s}$ depending on the parametric settings. Compared with the previously proposed baseline soft-input ordered reliability bits GRAND (ORBGRAND) hardware implementation with similar decoding performance at target FER of $10^{-7}$, the improved step-GRAND hardware achieves $7\times \sim 17\times$ reduction in W.C. latency, $7\times$ reduction in power consumption, and $37\times \sim 66\times$ higher area efficiency in the W.C. scenario. Furthermore, the proposed hardware can achieve an average throughput of up to 10.5 Gb/s and a W.C. throughput of $102 \sim 350$ Mb/s.
Vol. 33, No. 4, pp. 1028-1041
Citations: 0
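The guessing principle behind GRAND, mentioned in the abstract above, can be illustrated with the basic hard-input variant: try error patterns from most to least likely (here, by increasing Hamming weight), remove each from the received word, and stop at the first result that passes the parity check. The sketch below is not the paper's soft-input step-GRAND or its test error pattern generator; the (7,4) Hamming code, the function names, and the `max_weight` cap are assumptions chosen to keep the example small.

```python
from itertools import combinations

# Parity-check matrix of the (7,4) Hamming code (illustrative choice;
# column j is the binary representation of j + 1, least significant bit on top).
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def is_codeword(word):
    """True if every parity check of H is satisfied (zero syndrome)."""
    return all(sum(h * w for h, w in zip(row, word)) % 2 == 0 for row in H)


def grand_decode(received, max_weight=3):
    """Hard-input GRAND: guess error patterns in order of increasing weight.

    Returns (codeword, flipped_positions), or (None, None) if no pattern
    up to max_weight yields a codeword (abandonment).
    """
    n = len(received)
    for weight in range(max_weight + 1):
        for positions in combinations(range(n), weight):
            candidate = list(received)
            for p in positions:
                candidate[p] ^= 1  # remove the guessed noise bit
            if is_codeword(candidate):
                return candidate, positions
    return None, None


sent = [1, 1, 1, 0, 0, 0, 0]      # a valid codeword of this code
received = [1, 1, 1, 0, 1, 0, 0]  # bit 4 flipped by the channel
decoded, flipped = grand_decode(received)
print(decoded, flipped)  # [1, 1, 1, 0, 0, 0, 0] (4,)
```

The number of guesses before a hit varies with the noise, which is the source of the gap between average and worst-case latency that the abstract discusses.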