{"title":"Corrections to “GNN-Based Hardware Trojan Detection at Register Transfer Level Leveraging Multiple-Category Features”","authors":"Peijun Ma;Ge Shang;Hongjin Liu;Jiangyi Shi;Weitao Pan;Yan Zhang;Yue Hao","doi":"10.1109/TVLSI.2025.3525903","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3525903","url":null,"abstract":"Presents corrections to the paper, (Corrections to “GNN-Based Hardware Trojan Detection at Register Transfer Level Leveraging Multiple-Category Features”).","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 3","pages":"902-902"},"PeriodicalIF":2.8,"publicationDate":"2025-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10852349","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143496461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved Step-GRAND: Low-Latency Soft-Input Guessing Random Additive Noise Decoding","authors":"Syed Mohsin Abbas;Marwan Jalaleddine;Chi-Ying Tsui;Warren J. Gross","doi":"10.1109/TVLSI.2025.3529637","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3529637","url":null,"abstract":"The ultrareliable low-latency communication (URLLC) application scenario requires the adoption of short linear block codes to satisfy the low-latency requirements. Guessing random additive noise decoding (GRAND) is a prominent universal decoding solution for short linear block codes that lends itself to efficient hardware implementations. GRAND-based hardware implementations generally offer reduced average decoding latency but their high worst-case (W.C.) latency renders them unsuitable for deployment in mission-critical applications. This article presents an improved version of step-GRAND, a soft-input variant of GRAND that features a novel test error pattern (TEP) generating approach. A novel very large-scale integration (VLSI) architecture is developed for the execution of the improved step-GRAND algorithm with reduced W.C. decoding latency. Application specific integrated circuit (ASIC) implementation results, employing low-power (LP) TSMC 65-nm CMOS technology, demonstrate that the proposed improved step-GRAND can achieve an average decoding latency as low as 10 ns for decoding a <inline-formula> <tex-math>$(128,105)$ </tex-math></inline-formula> linear block code at a target frame error rate (FER) of <inline-formula> <tex-math>$10^{-7}$ </tex-math></inline-formula>, while the W.C. decoding latency can reach <inline-formula> <tex-math>$300~\\text{ns}\\sim 1~\\mu\\text{s}$ </tex-math></inline-formula> depending on the parametric settings. Compared with the previously proposed baseline soft-input ordered reliability bits GRAND (ORBGRAND) hardware implementation with similar decoding performance at target FER of <inline-formula> <tex-math>$10^{-7}$ </tex-math></inline-formula>, the improved step-GRAND hardware achieves <inline-formula> <tex-math>$7\\times \\sim 17\\times $ </tex-math></inline-formula> reduction in W.C. latency, <inline-formula> <tex-math>$7\\times $ </tex-math></inline-formula> reduction in power consumption, and <inline-formula> <tex-math>$37\\times \\sim 66\\times $ </tex-math></inline-formula> higher area efficiency in the W.C. scenario. Furthermore, the proposed hardware can achieve an average throughput of up to 10.5 Gb/s and a W.C. throughput of <inline-formula> <tex-math>$102\\sim 350$ </tex-math></inline-formula> Mb/s.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 4","pages":"1028-1041"},"PeriodicalIF":2.8,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143676045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Chiplet Platform for Intelligent Radar/Sonar Leveraging Domain-Specific Reusable Active Interposer","authors":"Yafei Liu;Dejian Li;Zheng Yang;Chaoqin Zhang;Yunlai Zhang;Xiangyu Li;Mingwei Cao;Shouyi Yin","doi":"10.1109/TVLSI.2025.3529699","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3529699","url":null,"abstract":"Through chiplet reuse, chiplet-based system designs have emerged as a cost-effective solution for system-on-chips (SoCs), yet considerable silicon interposer costs often negate the benefits. Though general reusable interposers (GRIs) can lower the cost, they often compromise on performance and energy efficiency. In this article, a domain-specific reusable active interposer (active DSRI) approach is proposed for a better cost-efficiency tradeoff. Moreover, a chiplet platform based on an active DSRI designed for the intelligent radar/sonar (IRS) domain is introduced to facilitate rapid and customized SoC development. This platform offers flexible and energy-efficient interconnections tailored for IRS, platform infrastructure functions, and peripherals to simplify the chiplets. Furthermore, it integrates lightweight, composable standard 3-D interfaces across the chiplets and interposer, delivering up to 96-Gb/s bandwidth, 11.1-ns latency, and 0.62-pJ/bit energy efficiency, well controlling the cost and power penalties of SoC partition. Demonstrated with a customized hand gesture recognition sonar system (HGRSS) baseband SoC implemented on the proposed platform, it achieves similar performance to a monolithic SoC, with a recognition frame rate of 6286 frames/s, where overhead of the 3-D interface is only 6.86% in area and 4.84% in power. Our approach proves cost-effective, energy efficient, and customizable, moving system volume breakeven point forward by <inline-formula> <tex-math>$3.22\\sim 3.36$ </tex-math></inline-formula> times, and reducing the cost by 58.5%–59.8%. This represents a pioneering demonstration of reusable chiplets in HGRSS, showcasing the potential of our approach for broader domains.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 4","pages":"903-915"},"PeriodicalIF":2.8,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143676080","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FlooNoC: A 645-Gb/s/link 0.15-pJ/B/hop Open-Source NoC With Wide Physical Links and End-to-End AXI4 Parallel Multistream Support","authors":"Tim Fischer;Michael Rogenmoser;Thomas Benz;Frank K. Gürkaynak;Luca Benini","doi":"10.1109/TVLSI.2025.3527225","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3527225","url":null,"abstract":"The new generation of domain-specific AI accelerators is characterized by rapidly increasing demands for bulk data transfers, as opposed to small, latency-critical cache line transfers typical of traditional cache-coherent systems. In this article, we address this critical need by introducing the FlooNoC network-on-chip (NoC), featuring very wide, fully advanced extensible interface (AXI4) compliant links designed to meet the massive bandwidth needs at high energy efficiency. At the transport level, nonblocking transactions are supported for latency tolerance. In addition, a novel end-to-end ordering approach for AXI4, enabled by a multistream capable direct memory access (DMA) engine, simplifies network interfaces (NIs) and eliminates interstream dependencies. Furthermore, dedicated physical links are instantiated for short, latency-critical messages. A complete end-to-end reference implementation in 12-nm FinFET technology demonstrates the physical feasibility and power performance area (PPA) benefits of our approach. Using wide links on high levels of metal, we achieve a bandwidth of 645 Gb/s/link and a total aggregate bandwidth of 103 Tb/s for an <inline-formula> <tex-math>$8\\times 4$ </tex-math></inline-formula> mesh of processors’ cluster tiles, with a total of 288 RISC-V cores. The NoC imposes a minimal area overhead of only 3.5% per compute tile and achieves a leading-edge energy efficiency of 0.15 pJ/B/hop at 0.8 V. Compared with state-of-the-art (SoA) NoCs, our system offers three times the energy efficiency and more than double the link bandwidth. Furthermore, compared with a traditional AXI4-based multilayer interconnect, our NoC achieves a 30% reduction in area, corresponding to a 47% increase in <inline-formula> <tex-math>$\\text{GFLOPS}_{\\text{DP}}$ </tex-math></inline-formula> within the same floorplan.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 4","pages":"1094-1107"},"PeriodicalIF":2.8,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143676130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Protecting Analog Circuits Using Switch Mode Time Domain Locking","authors":"Utkarsh Kumar;Sudhanshu Khanna;Ankit Mittal;Aatmesh Shrivastava","doi":"10.1109/TVLSI.2025.3528320","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3528320","url":null,"abstract":"Analog circuits remain vulnerable to different types of supply chain attacks including piracy, overproduction, counterfeiting, and reverse engineering. In this article, we present a switch mode time domain locking (SMDL) technique to protect analog circuits. This technique integrates a locking mechanism into the time-domain functionality of the circuit. It uses random-key-based switching phases for analog circuits instead of the fixed clocks that are conventionally used. The random switching phases depend on a key that can be made arbitrarily long. Only a correct key (CK) with correct alignment of phases can unlock circuit functionality. The locking technique can be applied to a variety of switch-mode analog circuits such as filters, amplifiers, and regulators, among others. We implemented this technique on a folded cascode amplifier (FCA) and on a switched-capacitor bandgap reference (BGR) circuit. In both implementations, we employ a 128-bit key to lock the circuit functionality. The design is implemented in a 65-nm CMOS technology. An incorrect key (IK) introduces almost 100% variation in the circuit functionality, ensuring a high level of security.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 4","pages":"916-928"},"PeriodicalIF":2.8,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143676017","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems Society Information","authors":"","doi":"10.1109/TVLSI.2025.3527804","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3527804","url":null,"abstract":"","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 2","pages":"C3-C3"},"PeriodicalIF":2.8,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10849954","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142992844","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel High-Speed Adaptive Duobinary Digital Detector Based on the Feed-Forward Equalizer and the Maximum Likelihood Sequence Detector for Wireline Transceivers","authors":"Chaolong Xu;Fangxu Lv;Mingche Lai;Xingyun Qi;Qiang Wang;Zhang Luo;Shijie Li;Geng Zhang","doi":"10.1109/TVLSI.2025.3528127","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3528127","url":null,"abstract":"To solve the high bit error rate (BER) problem of conventional 56-Gb/s nonreturn-to-zero (NRZ) transceivers under high-insertion loss (IL) channels, this study proposes a high-speed adaptive duobinary (DB) digital detector based on the feed-forward equalizer (FFE) and the maximum likelihood sequence detector (MLSD). In this detector, adaptive FFE is combined with channel characteristics to generate DB signals and complete equalization, thus extending the transmission bandwidth and eye height and allowing a larger sampling phase offset. The parallel MLSD is used to complete the detection and decoding of DB signals to reduce the BER. An adaptive algorithm is proposed to avoid the long convergence time of the conventional zero-forcing (ZF) algorithm applied to the DB detector, so that it can be applied to various bit rates and IL channels. In this study, the verification of this DB detector is accomplished at 56 Gb/s. The platform based on a 56-Gb/s analog front-end chip (AFEC) and field-programmable gate array (FPGA) proves that the detector can work well in 12–56 Gb/s and multiple IL channels. The BER was less than 2e-8 at 56 Gb/s on −42-dB channel loss at 28 GHz. The structure is also well suited to higher-rate transceivers, such as those operating at 112 Gb/s.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 4","pages":"1042-1052"},"PeriodicalIF":2.8,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143676036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems Publication Information","authors":"","doi":"10.1109/TVLSI.2024.3523620","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3523620","url":null,"abstract":"","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 2","pages":"C2-C2"},"PeriodicalIF":2.8,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10849955","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142992850","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Low-Cost and Triple-Node-Upset Self-Recoverable Latch Design With Low Soft Error Rate","authors":"Licai Hao;Lang Tian;Hao Wang;Shiyu Zhao;Qiang Zhao;Chunyu Peng;Chenghu Dai;Zhitin Lin;Xiulong Wu","doi":"10.1109/TVLSI.2025.3528199","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3528199","url":null,"abstract":"With the decrease in feature size of transistors, latches are more sensitive to single-event multiple node upset (MNU), including double node upset (DNU) and triple node upset (TNU). However, the reported TNU self-recoverable (TNUR) latches are facing problems with large areas and power consumption. Based on the polarity design, this article proposes a low-cost TNUR latch (LCTRL) with a low soft error rate (SER) in 28-nm CMOS technology. The proposed LCTRL mainly consists of four interlocked modules and a clock-gated inverter. Compared with the state-of-the-art TNUR latches, including LCTNURL, IHTRL, FATNU, and TRLW, the power consumption, D-Q delay, CLK-to-Q delay, area, and the power-delay–area product (PDAP) of the proposed LCTRL are reduced by 55.09%, 38.64%, 42.93%, 44.65%, and 83.50%, respectively. Due to the polarity design, the SER of the proposed LCTRL is the smallest among compared latches, which suggests that the proposed LCTRL is suitable for use in radiation environments.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 4","pages":"1108-1117"},"PeriodicalIF":2.8,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143675888","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 285-nA Quiescent Current, 94.7% Peak Efficiency Buck Converter With AOT Control for IoT Application","authors":"Yuxin Zhang;Jueping Cai;Jizhang Chen;Lifeng Jiang;Yixin Yin","doi":"10.1109/TVLSI.2025.3527453","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3527453","url":null,"abstract":"An ultralow quiescent current dc-dc buck converter based on adaptive on-time (AOT) control is presented in this article. To minimize the energy wastage of the dc-dc buck converter circuit when the Internet-of-Things (IoT) device is in standby mode, a control loop with nano-ampere quiescent current is proposed in this converter. To reduce the quiescent current consumed by the voltage reference and improve its line sensitivity (LS), the voltage reference in the proposed converter is preregulated and based on the subthreshold CMOS implementation, with a quiescent current of only 20 nA. Meanwhile, for the purpose of maintaining high efficiency of the converter under the ultralow load, an adaptive comparator based on the dynamic bias mode selection circuit is proposed, which converts the load conditions into time information and switches the bias current and gain of the comparator under ultralow loads, and the quiescent current of the comparator is only 65 nA. The proposed converter is implemented in a 0.18-<inline-formula> <tex-math>$\\mu $ </tex-math></inline-formula>m BCD process with an area of 1.35 mm². Experimental results show that the converter has a minimum quiescent current of 285 nA, maintains more than 80% conversion efficiency over a load range of <inline-formula> <tex-math>$10~\\mu $ </tex-math></inline-formula>A–300 mA and a peak efficiency of 94.7%, and has an output of 0.9–4.8 V over a supply condition of 2–5.5 V.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 4","pages":"929-941"},"PeriodicalIF":2.8,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143675977","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}