{"title":"A Low-Ripple DIDO DC–DC Hybrid Interface With Optimal-Hysteresis-Controlled MPPT for TEH","authors":"Xufeng Liao;Jiabin Wang;Peiyuan Fu;Yu Du;Lianxi Liu","doi":"10.1109/TVLSI.2025.3540106","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3540106","url":null,"abstract":"This article proposes a dual-input-dual-output (DIDO) dc-dc hybrid interface for thermoelectric energy harvesting (TEH) applications with high efficiency and low output ripple. A load-first ordered power distributive control (OPDC) strategy is used to recycle the excess thermoelectric energy (TE) in time. Utilizing the digital adaptive <sc>on</sc>-time (DAOT) technique, the output ripple can be reduced during battery (BAT) power supply. A hysteresis-controlled maximum power point tracking (MPPT) technique is proposed to track the variation of the internal resistance <inline-formula> <tex-math>$\\text {R}_{\\text {TE}}$ </tex-math></inline-formula> of the thermoelectric generator (TEG), which achieves high tracking efficiency over a wide <inline-formula> <tex-math>$\\text {R}_{\\text {TE}}$ </tex-math></inline-formula> range. By trading the tracking efficiency and loss off in the MPPT, an optimization method for hysteresis window is proposed. In addition, an analog zero-crossing detector (ZCD) without calibration is adopted to improve the end-to-end efficiency. The proposed hybrid interface is realized by 0.18-<inline-formula> <tex-math>$\\mu $ </tex-math></inline-formula>m standard CMOS process with a core area of <inline-formula> <tex-math>$0.91 \\times 0.61$ </tex-math></inline-formula> mm². Measured results show that the proposed interface can harvest TE over the <inline-formula> <tex-math>$\\text {R}_{\\text {TE}}$ </tex-math></inline-formula> variation range of 1–<inline-formula> <tex-math>$1000 \\; \\Omega $ </tex-math></inline-formula>, with a peak tracking efficiency of 99.6% and an output ripple as low as 35 mV. 
It also achieves a peak end-to-end efficiency of 87% and an output power range of <inline-formula> <tex-math>$1 \\; \\mu $ </tex-math></inline-formula>W–10 mW.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 6","pages":"1541-1550"},"PeriodicalIF":2.8,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144117238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 1-MS/s 64-Channel Data Acquisition System With Full-Scale Input Range for Area-Sensitive Application Achieved 165.3-dB FoMS/Ch","authors":"Runkun Zhu;Guangyi Chen;Xueyou Shi;Bowei An;Fei Zhou;Yacong Zhang;Wengao Lu;Zhongjian Chen","doi":"10.1109/TVLSI.2025.3539676","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3539676","url":null,"abstract":"This article presents a novel data acquisition system (DAS) designed for area-sensitive applications, including automotive instrumentation, launch vehicles, and satellites, which require efficient area utilization and full-swing input/output capabilities for diverse sensors. Traditional approaches to achieving full swing, such as using CMOS input transistors or generating negative voltage via charge pumps, either result in high harmonic distortion (HD) or complicate chip design. To address these challenges, we propose a complementary analog front-end (CAFE) that shifts the input signal by a fixed voltage, enabling operation in a more linear region while employing feedback to minimize HD. The system incorporates two analog-to-digital converters (ADCs) that convert the input signal and subtract the digital output of the common mode voltage (<inline-formula> <tex-math>$V_{\\text {CM}}$ </tex-math></inline-formula>), facilitating effective data fusion for complete AD conversion. Fabricated using a 180-nm 1P5M BCD process, the DAS consumes 118.81 mW from a 5.0-V power supply, providing 64 channels in a compact area of <inline-formula> <tex-math>$4.0\\times 3.2$ </tex-math></inline-formula> mm. 
At a sampling rate of 1 MS/s, it achieves an effective number of bits (ENoBs) of 13.16, with a power consumption of 1.856 mW per channel and a dynamic range of 83.3 dB, resulting in an impressive figure of merit per channel (FoMS/Ch) of 165.3 dB.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 6","pages":"1551-1560"},"PeriodicalIF":2.8,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144117240","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Robust Computing-in-Memory Macro With 2T1R1C Cells and Reused Capacitors for Successive-Approximation ADC","authors":"Rui Xiao;Minghan Jiang;Xinran Li;Haibin Shen;Kejie Huang","doi":"10.1109/TVLSI.2025.3539826","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3539826","url":null,"abstract":"Computing-in-memory (CIM) has emerged as a practical paradigm to bypass the von Neumann bottleneck. However, traditional CIM schemes face challenges due to the nonideal characteristics of nonvolatile memory (NVM). To address this issue, this work provides a resistive random access memory (RRAM)-based CIM macro employing two-transistor-one-RRAM–one-capacitor (2T1R1C) cells, with capacitors reused for the successive-approximation analog-to-digital converter (SAR ADC). Single-level RRAM is utilized to mitigate resistance variation. The multiply-accumulate (MAC) operation is performed via the charge and discharge of capacitors, enhancing robustness across different process, voltage, and temperature (PVT) corners. The capacitors in 2T1R1C cells are repurposed as sampling capacitors to integrate the ADC with the array. A precision-adjustable SAR (PA-SAR) logic is proposed to generate partial sums at varying precision levels aligned with different input bits, optimizing energy efficiency while maintaining reliability. Our proposed 2T1R1C array features an average area of <inline-formula> <tex-math>$3.403~\\mu $ </tex-math></inline-formula>m² for each cell, which accounts for 87.46% of the total macro area. The total macro area is 1.020 mm² with a capacity of 256 Kb, achieving an energy density of 0.201 TOPS/mm². 
The PA-SAR logic boosts energy efficiency to 44.71 TOPS/W, marking a 38.55% improvement over conventional full-precision schemes.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 6","pages":"1693-1704"},"PeriodicalIF":2.8,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144117274","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems Publication Information","authors":"","doi":"10.1109/TVLSI.2025.3539514","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3539514","url":null,"abstract":"","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 3","pages":"C2-C2"},"PeriodicalIF":2.8,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10903159","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143489180","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Editorial: Renewed Excellence for 2025–2026","authors":"Mircea R. Stan","doi":"10.1109/TVLSI.2024.3520396","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3520396","url":null,"abstract":"I am happy and honored to have been reappointed as Editor in Chief (EiC) for the IEEE Transactions on VLSI Systems (TVLSI) for another two-year term. As I continue my efforts to improve the quality of the journal, I am grateful for the renewed trust placed in me by the three IEEE sponsoring societies (CASS, SSCS and CS) and by the VLSI community at large. Contrary to a feared slowdown due to increased difficulties with scaling, the field of Very Large Scale Integration (VLSI) has actually grown at an increasingly fast rate as it provides the hardware backbone for the insatiable AI applications which are taking over the world. The H100/200 GPUs, which are essential for AI training, are the largest “conventional” integrated circuits (IC) with 80 billion transistors, while the wafer-scale WSE2/3, which can provide significant improvements in AI inference, are absolute behemoths with 4 trillion transistors! Mr. Moore can be proud there in heaven for what our industry is able to deliver!","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 3","pages":"603-626"},"PeriodicalIF":2.8,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10903548","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143489101","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems Society Information","authors":"","doi":"10.1109/TVLSI.2025.3539516","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3539516","url":null,"abstract":"","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 3","pages":"C3-C3"},"PeriodicalIF":2.8,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10903516","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143496460","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 1 mW–10 W, Over 86.4% Efficiency Tri-Mode Buck Converter With Ripple-Based Control for Mobile Applications","authors":"Shuyu Zhang;Menglian Zhao;Shuang Song","doi":"10.1109/TVLSI.2025.3542096","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3542096","url":null,"abstract":"To achieve high efficiency over wide load range for modern mobile applications, this brief proposes a ripple-based V²-controlled buck converter operating with pulsewidth modulation (PWM)/pulse-frequency modulation (PFM)/load-adaptive standby mode (LASM). On system level, a delay-based load-adaptive V<inline-formula> <tex-math>$_{\\text {ON}}$ </tex-math></inline-formula> generator is exploited in LASM at ultralight load. When the output ripple is kept below its maximum restriction, the switching loss of the converter is further minimized in LASM compared with prior operation modes, including PFM, pulse-skip modulation (PSM), multiple-sawtooth PWM (MSPWM), and deep green mode (DGM). On circuit level, a dynamic-biased dual-offset hysteresis comparator is proposed. Together with other blocks that can be disabled, the quiescent consumption of controller in LASM is reduced to only 14 <inline-formula> <tex-math>$\\mu $ </tex-math></inline-formula>W. Fabricated in a 130-nm BCD process, the proposed converter can provide a 1.8-V output with a power density of 4.11 W/mm². 
It achieves a 93.2% peak efficiency, while the efficiency can be maintained above 86.4% in 1 mW–10 W (<inline-formula> <tex-math>$\\times 10~000$ </tex-math></inline-formula>) load range.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 6","pages":"1784-1788"},"PeriodicalIF":2.8,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144117359","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Nanopower EEG Low-Pass Filter Using Current-Sharing Vertical Differential Pairs","authors":"Prajuab Pawarangkoon;Rafidah Ahmad;Ruhaifi Abdullah Zawawi;Asrulnizam Abd Manaf;Wanlop Surakampontorn;Surachoke Thanapitak","doi":"10.1109/TVLSI.2025.3540116","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3540116","url":null,"abstract":"A follower-based <inline-formula> <tex-math>${g}_{m} - C$ </tex-math></inline-formula> low-pass filter that employs CMOS vertical source-couple-pair (VSCP) transconductors is proposed for practical use in EEG acquisition systems. The VSCP transconductor operates as a <inline-formula> <tex-math>${g}_{m}$ </tex-math></inline-formula> cell with current sharing and linearity enhancement features. It is applied in the first- and second-order <inline-formula> <tex-math>${g}_{m} - C$ </tex-math></inline-formula> sections cascaded to form a third-order low-pass filter targeting a 150-Hz bandwidth. To mitigate the effects of biasing current source mismatch, dynamic element matching (DEM) is optionally applied to the relevant biasing current source pairs, resulting in second harmonic distortion (HD2) and noise suppression. Implemented in a 0.18-<inline-formula> <tex-math>$\\mu $ </tex-math></inline-formula>m process, the proposed filter consumes 16.3-nW power from a 1.2-V supply. Thanks to the DEM and VSCPs, the filter achieves a 150-mVP linear input range [measured at 1% total harmonic distortion (THD)], whereas the input-referred noise of <inline-formula> <tex-math>$43~\\mu \\text {V}_{\\text {rms}}$ </tex-math></inline-formula> is obtained leading to a filter dynamic range (DR) of 65.15 dB. 
Overall performance comparisons with other recent nanopower filters indicate that the figure of merit (FoM) of this proposed filter is comparable, while the linear input range is larger.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 6","pages":"1530-1540"},"PeriodicalIF":2.8,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144117235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Energy-Efficient FPGA Accelerator for Swin Transformer","authors":"Yuefei Wang;Wendong Mao;Huihong Shi;Jin Sha;Zhongfeng Wang","doi":"10.1109/TVLSI.2025.3540844","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3540844","url":null,"abstract":"Recently, transformers have shown strong performance in tasks such as computer vision and natural language processing. Notably, Swin Transformer has gained significant attention for its low computational complexity and impressive performance in computer vision tasks, due to its window attention mechanism and hierarchical architecture. However, these features also make hardware deployment more complicated. In this brief, we present an energy-efficient field-programmable gate array (FPGA) accelerator for Swin Transformer to support the hierarchical architecture and execute the window attention. First, we introduce a systolic array with alterable datapath (SAAD) to conduct the window attention. Second, we split the patch merging operation and design a data rearrangement module, which reduces the computing latency induced by the data rearrangement in Swin Transformer. Third, we present a parallelized dual-array dataflow to support different computing operations in Swin Transformer. We implement the accelerator on the Xilinx XCZU19EG platform. 
The proposed architecture achieves a throughput per digital signal processing (DSP) of 0.630 giga operations per second (GOPS)/DSP, which is <inline-formula> <tex-math>$1.94\\times $ </tex-math></inline-formula> higher than existing works.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 6","pages":"1774-1778"},"PeriodicalIF":2.8,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144117250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Hierarchical 3-D Physical Design Method for Ultralarge-Scale Logic-on-Memory CGRA Chip","authors":"Zizheng Dong;Shuaipeng Li;Weijia Zhu;Ang Li;Qin Wang;Naifeng Jing;Weiguang Sheng;Jianfei Jiang;Zhigang Mao","doi":"10.1109/TVLSI.2025.3538883","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3538883","url":null,"abstract":"Face-to-face bonded 3-D (F2F 3D) technology, with the potential to significantly reduce chip area while enhancing performance, stands as one of the most promising ways to extend Moore’s Law. However, current 3-D physical design flows are often modifications of 2-D design flows and rely on technical personnel to manually modify technical files. Furthermore, existing research on 3-D design flow primarily focuses on module implementation, with very few studies addressing hierarchical design methods for large-scale chips. In this article, we first introduce a 3-D physical design flow which concurrently optimizes the timing of both the logic tier and the memory tier, achieving synchronized physical design for both tiers. Then, we develop a bottom-up hierarchical 3-D physical design flow to extend the 3-D design flow to large-scale chip design. Through coordinated power planning, clock tree design, and interconnect unit design, we enhance the power, performance, and area (PPA) metrics of the entire chip. Using our RTL-to-GDS physical design flow, we successfully implemented a 28-nm CMOS logic-on-memory (LoM) 3-D coarse-grained reconfigurable architecture (CGRA) chip with over 50 million gates. Experimental results demonstrate that our 3-D flow improves timing by 16.1% while reducing voltage drop by 38.6% compared to the 2-D design. 
In addition, the power-delay product (PDP) of the 3-D chip decreases by 10.2%, showcasing better performance.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 6","pages":"1502-1515"},"PeriodicalIF":2.8,"publicationDate":"2025-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144117237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}