{"title":"IEEE Foundation - Reflecting on 50 Years of Impact","authors":"","doi":"10.1109/TVLSI.2024.3504313","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3504313","url":null,"abstract":"","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"32 12","pages":"2408-2408"},"PeriodicalIF":2.8,"publicationDate":"2024-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10791339","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142810597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Fast Design Optimization of On-Chip Equalizing Links Using Particle Swarm Optimization","authors":"Hyoseok Song;Kwangmin Kim;Gain Kim;Byungsub Kim","doi":"10.1109/TVLSI.2024.3508079","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3508079","url":null,"abstract":"We propose a fast algorithm that optimizes on-chip equalizing link designs using particle swarm optimization (PSO). Finding the optimal design parameters of an equalizing link requires too much computation time because the dependency between design parameters and performance is complex and the design space is large. The proposed algorithm greatly reduces the optimization time by exploiting the efficiency of PSO in heuristic search. In experiments, on average, the proposed algorithm optimized a link design $168\\times$ faster than the previous state-of-the-art result, requiring only 1/256 of the evaluation count, and reduced computation time from about 2 h to 45 s.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 1","pages":"1-9"},"PeriodicalIF":2.8,"publicationDate":"2024-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142918377","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Manipulated Lookup Table Method for Efficient High-Performance Modular Multiplier","authors":"Anawin Opasatian;Makoto Ikeda","doi":"10.1109/TVLSI.2024.3505920","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3505920","url":null,"abstract":"Modular multiplication is a fundamental operation in many cryptographic systems, and its efficiency plays a crucial role in their overall performance. Since many cryptographic systems operate with a fixed modulus, we propose an enhancement to the fixed-modulus lookup table (LuT) method used for modular reduction, which we refer to as the manipulated LuT (MLuT) method. Our approach applies to any modulus and has demonstrated performance comparable to some specialized reduction algorithms designed for specific moduli. The strength of our proposed method in terms of circuit performance is shown by implementing it on Virtex7 and Virtex Ultrascale+ FPGAs as the LUT-based MLuT modular multiplier (LUT-MLuTMM) with generalized parallel counters (GPCs) used in the summation step. In one-stage implementations, our proposed method achieves up to a 90% reduction in area and a 50% reduction in latency compared with the generic LuT method. In multistage implementations, our approach offers the best area-interleaved time product, with improvements of 39%, 13%, and 29% over the current state-of-the-art for ~256-bit, SIKE434, and BLS12-381 modular multipliers, respectively. These results demonstrate the potential of our method for high-performance cryptographic accelerators employing a fixed modulus.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 1","pages":"114-127"},"PeriodicalIF":2.8,"publicationDate":"2024-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10777922","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142918390","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 0.875–0.95-pJ/b 40-Gb/s PAM-3 Baud-Rate Receiver With One-Tap DFE","authors":"Jhe-En Lin;Shen-Iuan Liu","doi":"10.1109/TVLSI.2024.3507714","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3507714","url":null,"abstract":"This article presents a 40-Gb/s (25.6-GBaud) three-level pulse amplitude modulation (PAM-3) baud-rate receiver with a one-tap decision-feedback equalizer (DFE). A baud-rate phase detector (BRPD) that locks at the point with zero first postcursor is proposed. In addition, by reusing the BRPD’s error samplers, a weighting coefficient calibration is presented to select the DFE weighting coefficient that maximizes the top level of the eye diagram, thereby improving eye height across different channel losses. An inductorless continuous-time linear equalizer (CTLE) and a variable gain amplifier (VGA) are also included. The VGA adjusts the output common-mode resistance to control data swing, reducing power consumption when the required swing is small. Furthermore, by using the modified summer-merged slicers, the capacitance from the slicers to the VGA is reduced. Finally, a digital clock/data recovery (CDR) circuit is presented, which includes a demultiplexer (DeMUX) with a short delay time to reduce the loop latency. The 40-Gb/s PAM-3 receiver is fabricated in 28-nm CMOS technology. For a 25.6-GBaud pseudorandom ternary sequence of $3^{7}-1$, the measured bit error rate (BER) is below $10^{-12}$ for channel losses of 9 and 17.5 dB. At 9-dB loss, total power consumption is 35 mW with a calculated FoM of 0.875 pJ/bit. At 17.5-dB loss, total power consumption is 38 mW with a calculated FoM of 0.95 pJ/bit.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 1","pages":"168-178"},"PeriodicalIF":2.8,"publicationDate":"2024-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142918175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"VSAGE: An End-to-End Automated VCO-Based ΔΣ ADC Generator","authors":"Ken Li;Tian Xie;Tzu-Han Wang;Shaolan Li","doi":"10.1109/TVLSI.2024.3507567","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3507567","url":null,"abstract":"This article presents VSAGE, an agile end-to-end automated voltage-controlled oscillator (VCO)-based $\\Delta\\Sigma$ analog-to-digital converter (ADC) generator. It exploits time-domain architectures and a time-domain design mindset, so that the design flow is oriented around digital standard cells, in contrast to the transistor-level-focused approach of conventional analog design. This speeds up and simplifies both the synthesis phase and the layout phase. Combined with an efficient knowledge-machine learning (ML)-guided synthesis flow, it can translate input specifications to a full system layout with reliable performance within minutes. This work also features a compact oscillator and system modeling method that facilitates light-resource accurate computation and network training. The generator is verified with 12 design cases in 65-nm and 28-nm processes, demonstrating its capability of generating competitive designs with good process portability.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 1","pages":"128-139"},"PeriodicalIF":2.8,"publicationDate":"2024-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142918379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Embedded Architecture for DDR5 DFE Calibration Based on Channel Stimulus Inversion","authors":"Mitchell Cooke;Nicola Nicolici","doi":"10.1109/TVLSI.2024.3505835","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3505835","url":null,"abstract":"The increase in performance promised by the recent generation of double data rate (DDR) memory, DDR5, is conditioned by addressing its signal integrity challenges. The DDR5 standard specifies a 4-tap decision feedback equalizer (DFE) at the memory receiver to deal with these challenges. Although adaptive equalization is a mature field, known methods for DFE calibration are limited by the DDR5 interface complexity and the equalization requirements mandated by its specification. In this article, we propose a novel approach based on linear inversion of channel stimulus that leverages specific architectural details of DDR5 and can tune memory devices deterministically at runtime. In addition to using few hardware resources relative to a modern memory controller, by operating at very low latency, this new approach facilitates periodic equalization when the DFE is offline, thus avoiding DFE error propagation during training inherent to adaptive techniques.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 3","pages":"793-806"},"PeriodicalIF":2.8,"publicationDate":"2024-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143489181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MCM-SR: Multiple Constant Multiplication-Based CNN Streaming Hardware Architecture for Super-Resolution","authors":"Seung-Hwan Bae;Hyuk-Jae Lee;Hyun Kim","doi":"10.1109/TVLSI.2024.3504513","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3504513","url":null,"abstract":"Convolutional neural network (CNN)-based super-resolution (SR) methods have become prevalent in display devices due to their superior image quality. However, the significant computational demands of CNN-based SR require hardware accelerators for real-time processing. Among the hardware architectures, the streaming architecture can significantly reduce latency and power consumption by minimizing external dynamic random access memory (DRAM) access. Nevertheless, this architecture necessitates a considerable hardware area, as each layer needs a dedicated processing engine. Furthermore, achieving high hardware utilization in this architecture requires substantial design expertise. In this article, we propose methods to reduce the hardware resources of CNN-based SR accelerators by applying the multiple constant multiplication (MCM) algorithm. We propose a loop interchange method for the convolution (CONV) operation to reduce the logic area by 23% and an adaptive loop interchange method for each layer that considers both the static random access memory (SRAM) and logic area simultaneously to reduce the SRAM size by 15%. In addition, we improve the MCM graph exploration speed by $5.4\\times$ while maintaining the SR quality through beam search when CONV weights are approximated to reduce the hardware resources.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 1","pages":"75-87"},"PeriodicalIF":2.8,"publicationDate":"2024-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142918474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Harmonic-Suppressed GaN Power Amplifier Using Artificial Coupled Resonator","authors":"Letian Guo;Jincheng Zhang;Lihe Nie;Jian Wang;Yong Chen;Junyan Ren;Shunli Ma","doi":"10.1109/TVLSI.2024.3487002","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3487002","url":null,"abstract":"This brief presents an 11.5–17.5-GHz power amplifier (PA) with 32-dBm output power in a 0.25-$\\mu$m gallium nitride (GaN) process. Capacitively and inductively coupled resonators are used for impedance matching to achieve a flat in-band power gain and a high out-of-band rejection. Meanwhile, the output matching network provides second-harmonic suppression to improve the average efficiency within the bandwidth of the PA. The measurements show that the proposed PA exhibits an output power of 31–32.5 dBm and a power gain of more than 10.5 dB from 11.5 to 17.5 GHz. Because the matching networks provide a convenient dc feed and dc block, the chip measures only $2.1\\times 1.1$ mm$^{2}$, corresponding to a power density of 0.77 W/mm$^{2}$. The proposed PA demonstrates a competitive fractional bandwidth and power density among GaN PA monolithic microwave integrated circuits (MMICs).","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 3","pages":"882-886"},"PeriodicalIF":2.8,"publicationDate":"2024-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143496462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Single-Ended High-Voltage-Compliant 11-bit Current-Steering Digital-to-Analog Converter for Adaptive Noise Cancellation in Power Over Data Line Networks","authors":"Felix Burkhardt;Florian Protze;Frank Ellinger","doi":"10.1109/TVLSI.2024.3496845","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3496845","url":null,"abstract":"Automotive Ethernet is considered to be the backbone of future in-vehicle data communication. One main feature is its ability to simultaneously transmit data and energy via power over data lines (PoDL). This article proposes the design of a single-ended high-voltage (HV)-compliant 11-bit current-steering digital-to-analog converter (DAC). The converter is tailored for use as a digitally controlled current source in an adaptive noise-cancellation filter for PoDL networks. Designed in an HV-compliant 180-nm bipolar complementary metal-oxide-semiconductor (BiCMOS) technology, the DAC features a monolithically combined topology of two identical 10-bit low-voltage (LV) current-steering DACs supplied at 1.8 V and two complementary HV-compliant output current stages. The main design feature of the segmented LV DAC is the use of single-ended current cells with an optimized switching logic, proposed to enhance the cells' transient performance and energy efficiency. Furthermore, a newly derived $Q^{4}$ asymmetric rotated walk switching scheme is investigated. At a maximum output voltage of 60 V, the proposed DAC can deliver a bidirectional output current with amplitudes of up to 500 mA. The proposed DAC exhibits the highest voltage compliance combined with the highest output current compared with related works. It also features the second highest resolution. Operated at a sample rate of 10 MS/s with a resolution of 11 bits, it achieves a measured spurious-free dynamic range (SFDR) of 57.8 dB for a synthesized single tone at 100 kHz, as well as a maximum integral nonlinearity (INL) error of 1.61 LSB and a differential nonlinearity (DNL) error of 1.05 LSB.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 3","pages":"638-650"},"PeriodicalIF":2.8,"publicationDate":"2024-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143489214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel Parallel Feed-Forward Current Ripple Rejection (PFFCRR) Technique for High Load Current High PSRR nMOS LDOs","authors":"Yuhong Lu;Ting-An Yen;Rakshit Dambe Nayak;Shashank Alevoor;Bhushan Talele;Spoorti Patil;Keith Kunz;Bertan Bakkaloglu","doi":"10.1109/TVLSI.2024.3497803","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3497803","url":null,"abstract":"There is a significant demand in systems-on-chip (SoCs) for a high-power-efficiency low-dropout regulator (LDO) that provides lower dropout voltage, higher load current, and low quiescent current. A high power supply rejection ratio (PSRR) in the mid-to-high frequency band (0.1–10 MHz) is crucial for an LDO to generate low-noise power supplies when driven by switching power converters. However, enhancing the PSRR is a significant challenge because the pass field-effect transistor (FET) operates in the deep triode region under high-current and low-dropout conditions. In this article, a parallel feed-forward current ripple rejection (PFFCRR) technique is proposed to improve the PSRR performance regardless of the operation region of the nMOS pass FET. The proposed approach senses the supply-induced current ripple and cancels the original ripple through a current path that runs parallel to the nMOS pass FET. The proposed LDO is fabricated in a 180-nm BCD process. It achieves a PSRR better than −35 dB up to 10 MHz at 300-mV dropout voltage with 0.5-A load current and a load capacitor of $2.2~\\mu$F. The PFFCRR approach achieves a PSRR improvement of 18 dB at 1 MHz at 100-mV dropout voltage with a 2.15-A load current when the pass FET operates in the deep triode region. Moreover, the proposed LDO enhances the transient performance with an overshoot and an undershoot of 40.54 and 36.45 mV, respectively, against a $\\Delta I_{\\text{LOAD}}$ of 1 A with a slew rate of 1 A/$\\mu$s.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 3","pages":"651-661"},"PeriodicalIF":2.8,"publicationDate":"2024-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143489224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}