{"title":"ResiLogic: Leveraging Composability and Diversity to Design Fault and Intrusion Resilient Chips","authors":"Ahmad T. Sheikh;Ali Shoker;Suhaib A. Fahmy;Paulo Esteves-Verissimo","doi":"10.1109/TVLSI.2025.3544860","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3544860","url":null,"abstract":"A long-standing challenge is the design of chips resilient to faults and glitches. Both fine-grained gate diversity and coarse-grained modular redundancy have been used in the past. However, these approaches have not been well-studied under other threat models where some stakeholders in the supply chain are untrusted. Increasing digital sovereignty tensions raise concerns regarding the use of foreign off-the-shelf tools and intellectual property (IP), or off-sourcing fabrication, driving research into the design of resilient chips under this threat model. This article addresses a threat model considering three pertinent attacks to resilience: distribution, zonal, and compound attacks. To mitigate these attacks, we introduce the <italic>ResiLogic</italic> framework that exploits <italic>Diversity by Composability</italic>: constructing diverse circuits composed of smaller diverse ones by design. This approach enables designers to develop circuits in the early stages of design without the need for additional redundancy in terms of space or cost. To generate diverse circuits, we propose a technique using E-Graphs with new rewrite definitions for diversity. 
Using this approach at different levels of granularity is shown to improve the resilience of circuit design in <italic>ResiLogic</italic> up to <inline-formula> <tex-math>$\\times 5$ </tex-math></inline-formula> against the three considered attacks.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 6","pages":"1751-1764"},"PeriodicalIF":2.8,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144117383","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 3.7-nW 248-ppm/°C Subthreshold Self-Biased CMOS Current Reference","authors":"Jingjing Liu;Yuxuan Huang;Weijie Ge;Wenji Mo;Yuchen Wang;Feng Yan;Kangkang Sun;Bingjun Xiong;Zhipeng Li;Jian Guan","doi":"10.1109/TVLSI.2025.3546730","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3546730","url":null,"abstract":"A modified self-biased <inline-formula> <tex-math>$\\beta $ </tex-math></inline-formula>-multiplier-based current reference (CR) circuit is proposed for ultralow-power Internet of Things (IoT) application and is realized without any resistors, bipolar junction transistors (BJTs), or operational amplifiers (OPAs). The proposed CR circuit directly generates the reference current from a modified <inline-formula> <tex-math>$\\beta $ </tex-math></inline-formula>-multiplier, which is biased by a stacked diode-connected MOS transistor (SDMT)-based compensated through a complementary-to-absolute temperature (CTAT) voltage. The proposed CR is implemented in a standard 0.18-<inline-formula> <tex-math>$\\mu $ </tex-math></inline-formula>m CMOS process with an active area of 0.0069 mm<sup>2</sup> and almost all transistors operate in the subthreshold region. Measurement results show that the temperature coefficient (TC) of the CR is 248 ppm/°C in a temperature range from <inline-formula> <tex-math>$-40~^{\\circ }$ </tex-math></inline-formula>C to <inline-formula> <tex-math>$125~^{\\circ }$ </tex-math></inline-formula>C. The proposed CR exhibits a line sensitivity (LS) of 0.33%/V within the supply voltage range of 0.8–1.4 V. 
The output of the CR at room temperature (<inline-formula> <tex-math>$25~^{\\circ }$ </tex-math></inline-formula>C) is 1.84 nA with a power consumption of 3.7 nW.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 7","pages":"2024-2028"},"PeriodicalIF":2.8,"publicationDate":"2025-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144519289","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Metastable-Dither-Based Digital Background Calibration of Interstage Gain Nonlinearity in Pipelined SAR ADC","authors":"Le Chen;Yue Cao;Lin Ling;Shubin Liu;Haolin Han","doi":"10.1109/TVLSI.2025.3544825","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3544825","url":null,"abstract":"A digital background calibration technique is proposed in this brief, utilizing comparator metastability to correct conversion errors from interstage gain errors and higher order nonlinearities for the first time. The method calibrates nonlinear conversion errors by injecting multilevel dithers and observing amplifier gain variations. It offers advantages, such as simple design, high accuracy, fast convergence, and low power consumption. Simulation results demonstrate the effectiveness of the technique, with the signal to noise and distortion ratio (SNDR) and spurious free dynamic range (SFDR) performances of a 14-bit two-stage pipelined successive approximation register analog-to-digital converter (SAR ADC) improving from 60.4 and 73.6 to 84.5 and 110.0 dB, respectively. The convergence speed of the calibration algorithm is 0.8 million samples.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 6","pages":"1794-1798"},"PeriodicalIF":2.8,"publicationDate":"2025-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144117382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An RRAM-Based Computing-in-Memory Macro With Low-Power Readout/Hold Circuits and Activation Differential Strategy for AdderNet","authors":"Zhihang Qian;Shengzhe Yan;Zhuoyu Dai;Zeyu Guo;Zhaori Cong;Yifan He;Chunmeng Dou;Feng Zhang;Jinshan Yue;Yongpan Liu","doi":"10.1109/TVLSI.2025.3546684","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3546684","url":null,"abstract":"AdderNet is an innovative neural network (NN) structure that substitutes multiplications with additions in convolutional operations, while computing-in-memory (CIM) is an efficient architecture that tackles the memory bottleneck for von Neumann architectures. Previous work has explored the SRAM-based CIM AdderNet circuits and demonstrates high energy efficiency. However, it still suffers low storage density, repetitive readout, and redundant comparisons. In this brief, an RRAM-based CIM macro is proposed for efficient AdderNet with the following innovations. First, RRAM cells are adopted to replace SRAM for high-density weight storage. A low-power readout and hold circuit is proposed to save redundant read power of weight data held for multiple cycles. Second, an 8-bit comparator with an early-stop strategy is proposed to compare 8-bit activations and weights in one cycle. Third, an activation (ACT) differential strategy is proposed to reduce redundant comparisons. 
The proposed 28-nm RRAM CIM macro achieves 12.8-TOPS/mm<sup>2</sup> peak area efficiency and 126-TOPS/W peak energy efficiency, which is <inline-formula> <tex-math>$3.0\\times $ </tex-math></inline-formula> and <inline-formula> <tex-math>$1.2\\times $ </tex-math></inline-formula> compared with the state-of-the-art AdderNet CIM macro.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 7","pages":"2029-2033"},"PeriodicalIF":2.8,"publicationDate":"2025-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144519288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Information Leakage Through Physical Layer Supply Voltage Coupling Vulnerability","authors":"Sahan Sanjaya;Aruna Jayasena;Prabhat Mishra","doi":"10.1109/TVLSI.2025.3545804","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3545804","url":null,"abstract":"Power side-channel attacks are widely known for extracting information from data processed within a device while assuming that an attacker has physical access or the ability to modify the device. In this article, we introduce a novel side-channel vulnerability that leaks data-dependent power variations through physical layer supply voltage coupling (PSVC). Unlike traditional power side-channel attacks, the proposed vulnerability allows an adversary to mount an attack and extract information without modifying the device. In addition, unlike existing power-based remote attacks on field-programmable gate arrays (FPGAs), the PSVC vulnerability applies to both on-chip and on-board attacks. We assess the effectiveness of the PSVC vulnerability through three case studies, demonstrating several end-to-end attacks on general-purpose microcontrollers with varying adversary capabilities. These case studies provide evidence for the existence of the PSVC vulnerability, its applicability to on-chip as well as on-board side-channel attacks, and how it can eliminate the need for physical access to the target device, making it applicable to any off-the-shelf hardware. 
Our experiments also reveal that designing devices to operate at the lowest operational voltage significantly reduces the risk of PSVC side-channel vulnerability.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 6","pages":"1715-1728"},"PeriodicalIF":2.8,"publicationDate":"2025-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144117270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An RRAM Digital Computing-in-Memory Macro With Dual-Mode Multiplication and Maximum Value Rounding Adder Tree","authors":"Wang Ye;Hanghang Gao;Zhidao Zhou;Linfang Wang;Weizeng Li;Zhi Li;Jinshan Yue;Xiaoxin Xu;Jianguo Yang;Hongyang Hu;Chunmeng Dou","doi":"10.1109/TVLSI.2025.3545866","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3545866","url":null,"abstract":"Implementing digital computing-in-memory (DCIM) based on resistive memory (RRAM) faces several critical challenges due to the small signal margin, large device variations, and large energy- and area-overhead induced by the digital adder tree (AT). To address these issues, we propose an RRAM DCIM macro based on the standard foundry one-transistor-one-resistor (1T1R) cell array featuring: 1) dual-mode MAC operation for efficiency- or accuracy-oriented optimization; 2) margin-enhanced digitized unit (MEDU) to amplify the signal ratio; and 3) maximum value rounding AT (MVR-AT) to reduce its power- and area-overhead. A test chip is demonstrated using a 180 nm CMOS process to verify the concept. It achieves a peak energy efficiency (EF) of 63.08 TOPS/W in the efficiency-oriented mode and a minimum error rate of 1.58% in the accuracy-oriented mode. 
Their combination can meet the requirements of different workloads in AI computing tasks to optimize the overall power consumption with negligible accuracy loss.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 6","pages":"1779-1783"},"PeriodicalIF":2.8,"publicationDate":"2025-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144117388","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 12-bit 2-GS/s Pipeline ADC in 28-nm CMOS With Linear-Error Self-Calibration","authors":"Yabo Ni;Lu Liu;Yong Zhang;Tao Zhu","doi":"10.1109/TVLSI.2025.3545364","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3545364","url":null,"abstract":"This article discusses a 12-bit 2-GS/s pipeline analog-to-digital converter (ADC). A self-calibration technique is employed to correct linear errors due to capacitor mismatches and interstage gain errors (IGEs). To counteract the effects of power supply and temperature variations, the first three stages of the ADC are equipped with least-mean-squares (LMS) IGE background calibrations, enhanced by the injection of a 1-bit dither into these stages. The computational engines designed for background calibration were reused for self-calibration, simplifying the overall design. An improved integrated input buffer drives the ADC, achieving a bandwidth of approximately 6.3 GHz, which is essential for high-speed data acquisition and processing. Moreover, a low-power operational transconductance amplifier (OTA) and reference buffer, both operating on a 1.0-V supply, are implemented to minimize the chip’s power consumption. The 12-bit pipeline prototype ADC, fabricated using a 28-nm CMOS process, operates at 2-GS/s with a 1.0-Vpp input signal. It delivers a signal-to-noise-and-distortion ratio (SNDR) of 58.92 dB and a spurious-free dynamic range (SFDR) of 82.23 dB. 
The ADC core consumes only 180 mW, resulting in a Schreier figure of merits (FoMs) of 156.4 dB.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 6","pages":"1561-1569"},"PeriodicalIF":2.8,"publicationDate":"2025-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144117381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 3D Unified Analysis Method (3D-UAM) for Wafer-on-Wafer Stacked Near-Memory Structure","authors":"Song Wang;Yixin Guo;Wei Tao;Xuerong Jia;Fujun Bai;Jie Tan;Yubing Wang;Liang Bai;Fuzhi Guo;Qi Liu;Jin Li;Peng Yin;Fenning Liu;Jing Liu;Xiaodong Long;Yanwu Han;Zhongcheng Yu;Mengzi Cheng;Song Chen;Xiping Jiang","doi":"10.1109/TVLSI.2025.3566468","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3566468","url":null,"abstract":"The wafer-on-wafer (WoW) stacked structure exhibits pioneering advantages in near-memory computing but encounters challenges in 3D analysis due to the miniaturization of vertical connection structures and the simplification of vertical drivers. This article introduces a 3D unified analysis method (3D-UAM), which facilitates standard-cell-level signal integrity (SI) analysis across the 3D WoW stacked structure with hybrid processes, including a comprehensive 3D vertical connection theoretical model that bridges the dynamic random access memory (DRAM) and logic netlists. The accuracy of the 3D-UAM is confirmed through consistency analysis with the results of the 3D field model. The authenticity of the 3D-UAM is validated through correlation analysis with the physical test results from the WoW stacked DRAM test chip. 
The practicality of the 3D-UAM is demonstrated through channel optimization on a 20-layer DRAM WoW structure and power integrity (PI) analysis for the WoW stacked structure.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 8","pages":"2186-2199"},"PeriodicalIF":2.8,"publicationDate":"2025-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144705291","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Implementation Method for 100% ASK Modulation Applied to NFC Tags","authors":"De-Ming Wang;Ke-Xuan Chen;Guan-Jin Xu;Shun Li;Jing Wu;Yu-Xuan Huang;Jian-Guo Hu","doi":"10.1109/TVLSI.2025.3566029","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3566029","url":null,"abstract":"Passive near field communication (NFC) tags rely on the carrier-provided clock for operation. They can receive 100% amplitude shift keying (ASK)-modulated information but are unable to respond using 100% ASK modulation. This limitation restricts the tag’s resistance to interference and its communication range. This article proposes a design approach that enables passive NFC tags to employ 100% ASK modulation (termed “strong modulation” in this article, while non-100% ASK modulation is referred to as “typical modulation”) for response. Addressing the critical issue where the tag’s clock is lost due to the antenna carrier being turned off during the low signal bits of the modulation signal, preventing the tag from continuing to function, this article introduces a high-precision recovery clock circuit as a solution. The recovery clock circuit consists of a digitally controlled oscillator (DCO) circuit composed of 12 sets of current mirrors and a logic circuit DCO calibrator. The design feasibility was validated through the postlayout parasitic extraction and the AMS mixed-signal simulation in Cadence Virtuoso, ensuring correct communication between the tag and the reader. By implementing the tag’s strong modulation response, the anti-interference capability of the tag’s returned signal can be significantly enhanced, effectively reducing the difficulty of demodulation at the receiving end and improving the tag’s poor long-distance communication capabilities. 
Comparatively, the minimum antenna coupling coefficient <italic>k</italic> required for response under strong modulation is only 40.91% of that needed for typical modulation, enabling the tag to operate in weaker electromagnetic fields and exhibit better long-distance communication capabilities.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 8","pages":"2288-2298"},"PeriodicalIF":2.8,"publicationDate":"2025-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144702108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design of Low-Cost and High-Accurate 8-bit Logarithmic Floating-Point Arithmetic Circuits","authors":"Botao Xiong;Xingyu Shao;Chang Liu;Shize Zhang;Yuchun Chang","doi":"10.1109/TVLSI.2025.3563950","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3563950","url":null,"abstract":"Recent studies suggest that the 8-bit floating-point (FP) format plays an important role in the deep learning, where the <inline-formula> <tex-math>$E4M3$ </tex-math></inline-formula> (4-bit exponent, 3-bit mantissa) is suited for the natural language processing model and the <inline-formula> <tex-math>$E3M4$ </tex-math></inline-formula> is better on computer vision task. In this brief, the logarithmic number system (LNS) is used to simplify the design of FP8 multipliers and dividers because the multiplication and division can be performed by the addition and subtraction in the logarithmic domain. Furthermore, this brief finds that the 3- and 4-bit logarithmic and anti-logarithmic (Antilog) converters can be effectively realized by {<italic>x</italic>, <inline-formula> <tex-math>$x+1$ </tex-math></inline-formula>} and {<italic>x</italic>, <inline-formula> <tex-math>$x-1$ </tex-math></inline-formula>}. As a result, compared to the standard <inline-formula> <tex-math>$E4M3$ </tex-math></inline-formula> and <inline-formula> <tex-math>$E3M4$ </tex-math></inline-formula> multipliers, the cell area can be reduced by 32% and 40%. Compared to the standard <inline-formula> <tex-math>$E4M3$ </tex-math></inline-formula> and <inline-formula> <tex-math>$E3M4$ </tex-math></inline-formula> divider, the cell area can be reduced by 61% and 67%. In addition, compared with the INT8-based design, the area of convolution core using proposed multiplier is reduced by 33%. The accuracy loss of the quantized ResNet-50, MobileNet, and ViT-B based on the proposed convolution core are −0.12%, +0.38%, and +0.8%, which are better than the INT8-based design. 
In the end, the proposed divider can be used in the image change detection. The false rate is slightly reduced from 2.97% to 2.95% compared to the standard <inline-formula> <tex-math>$E3M4$ </tex-math></inline-formula> divider.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 7","pages":"2094-2098"},"PeriodicalIF":2.8,"publicationDate":"2025-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144519396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}