{"title":"A 40 TOPS Single-Chip Accelerator Enabling Low-Latency Inference for Deep Neural Networks","authors":"Xun He;Tao Cao;Youjiang Liu;Le Zhong;Guoping Xiao;Cong Yu","doi":"10.1109/TCSII.2025.3563062","DOIUrl":"https://doi.org/10.1109/TCSII.2025.3563062","url":null,"abstract":"To achieve low latency for edge applications, a single-chip sparse accelerator is proposed, which can conduct deep neural network (DNN) inference only using limited on-chip memory. Private memory is eliminated, and all memories are shared to reduce power and chip area. An adaptive and variable-length compression algorithm is proposed to store sparse DNNs. A weak-constrained pruning algorithm is proposed to resolve load balance issue in kernel level, which can achieve almost the same sparsity as unconstrained pruning schemes (UCP). Based on these works, a low latency inference accelerator is fabricated in 28-nm CMOS with 8256 MACs and 9.4 MB on-chip SRAM, which can achieve a latency of 0.44 ms for YOLO3 tiny. For high-sparsity layers, our chip can achieve <inline-formula> <tex-math>$6.1\\times$ </tex-math></inline-formula> speedup and a throughput of 40 TOPS. With a pruned YOLO model, our accelerator achieves <inline-formula> <tex-math>$6.7\\times$ </tex-math></inline-formula> lower latency and <inline-formula> <tex-math>$21.7\\times$ </tex-math></inline-formula> better energy efficiency than Jetson Orin. A high-speed evaluation platform is built to demonstrate real-time object detection at a throughput of 600 frames per second (fps) with a power of 1.34 W.","PeriodicalId":13101,"journal":{"name":"IEEE Transactions on Circuits and Systems II: Express Briefs","volume":"72 6","pages":"848-852"},"PeriodicalIF":4.0,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144170908","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Low-Area-Cost VLSI Architecture of Fault-Aware High-Reliability Triple-Mode Polar Decoder Chip Reconfiguring SC and SCL Decoding","authors":"Xin-Yu Shih;Dong-Lin Wu;Wei-Lun Chang","doi":"10.1109/TCSII.2025.3561211","DOIUrl":"https://doi.org/10.1109/TCSII.2025.3561211","url":null,"abstract":"In this brief, we propose a low-area-cost, fault-aware, high-reliability Polar decoder VLSI architecture that resists unexpected faults occurring randomly in the internal storage elements. For a 2048-bit codeword length, our triple-mode chip can be reconfigured to perform SC decoding or SCL decoding with a list size (L) of 2 or 4. In the ASIC implementation with TSMC 40-nm multi-Vt CMOS technology, the total core area of our work occupies only 0.769 mm² in chip layout, operating at a maximum frequency of 666.67 MHz. Compared with other state-of-the-art designs, only our chip supports high-reliability capability, at an area overhead of just 7.4%.","PeriodicalId":13101,"journal":{"name":"IEEE Transactions on Circuits and Systems II: Express Briefs","volume":"72 6","pages":"843-847"},"PeriodicalIF":4.0,"publicationDate":"2025-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144170815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A New Variant of Turns-Ratio Independent Shoot-Through Current-Based Magnetically-Coupled Z-Source Inverter With Smooth DC-Link Voltage and Enhanced High-Gain","authors":"S. Konar;P. K. Gayen;S. S. Saha","doi":"10.1109/TCSII.2025.3560895","DOIUrl":"https://doi.org/10.1109/TCSII.2025.3560895","url":null,"abstract":"In a magnetically coupled impedance-source (MCIS) inverter, a low shoot-through (ST) current is desired to reduce the ratings and losses of the power converter. In this regard, turns-ratio-independent ST current is an important requirement. Very few inverters (active-switched Y-source MCIS inverters) are found in a recent article, which claims turns-ratio-independent ST current as a figure of merit. However, the voltage gain of the turns-ratio-independent ST current-based configuration is observed to be lower than that of the turns-ratio-dependent ST current-based inverters. Therefore, this brief proposes a new variant of an active-switched MCIS inverter with a smooth DC-link voltage, which simultaneously exhibits enhanced voltage gain and a turns-ratio-independent ST current. The voltage gain of the suggested network is higher than that of the recent equivalent MCIS inverters for the same magnitude and duration of shoot-through current, i.e., the voltage gain per shoot-through current (combined figure of merit) is significantly improved in the proposed inverter. The switching device power (SDP) is also reduced in the suggested inverter. Its desired operation is experimentally verified.","PeriodicalId":13101,"journal":{"name":"IEEE Transactions on Circuits and Systems II: Express Briefs","volume":"72 6","pages":"858-862"},"PeriodicalIF":4.0,"publicationDate":"2025-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144170779","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 0.62-pJ/Bit 60-GHz OOK Receiver With Supply Interference Tolerance for Short-Range Interconnects","authors":"Junhong Liu;Yi Wu;Guangyin Feng;Rongbin Liu;Shaoxian Li;Chuan Hu;Xiuyin Zhang","doi":"10.1109/TCSII.2025.3560305","DOIUrl":"https://doi.org/10.1109/TCSII.2025.3560305","url":null,"abstract":"This brief presents an energy-efficient 60-GHz OOK receiver for massive short-range interconnects, addressing two key issues including interference in power distribution network and trade-off between sensitivity and energy efficiency. Variable gain low-noise amplifier with custom-designed bias-supply strategy is proposed to improve sensitivity and energy efficiency. Low-Q decoupling technique is proposed to improve supply interference tolerance, resulting in 2.6 times eye-opening in the eye diagram compared to the traditional one without low-Q decoupling. By co-designing the low-noise amplifier, envelope detector, and baseband amplifier, a prototype with proposed techniques was fabricated in a 65-nm LP CMOS process. Measurement results show that it achieves a maximum data rate of 16 Gbps with an energy efficiency of 0.62 pJ/bit and a sensitivity of -21.8 dBm, providing a high dynamic range, energy-efficient and interference robust solution for massive short-range interconnects.","PeriodicalId":13101,"journal":{"name":"IEEE Transactions on Circuits and Systems II: Express Briefs","volume":"72 6","pages":"818-822"},"PeriodicalIF":4.0,"publicationDate":"2025-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144171062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design and Application of Mean and Square Root Circuits for Stochastic Computing","authors":"Shaowei Wang;Kai Shi;Yaohua Xu;Yi Wang;Yongqiang Zhang","doi":"10.1109/TCSII.2025.3560332","DOIUrl":"https://doi.org/10.1109/TCSII.2025.3560332","url":null,"abstract":"Stochastic computing (SC) is an unconventional computing paradigm that represents values using probabilities. This representation enables simple logic gates to perform complex arithmetic operations. This brief proposes two low hardware-cost stochastic mean circuits for even and odd inputs, respectively, along with a high-accuracy stochastic square root circuit. The circuits are designed by considering correlation technique and achieve excellent performance. Experimental results demonstrate that the proposed mean circuits surpass previous counterparts in computing accuracy and hardware cost. For instance, the proposed 9-input mean circuit can achieve at least an 86.7% reduction in mean square error (MSE) and a 28.6% reduction in area. For the square root circuit, the proposed design achieves a reduction in MSE of at least 19.6%. The proposed circuits are further demonstrated with the Niblack binarization algorithm, which shows superior performance of accuracy.","PeriodicalId":13101,"journal":{"name":"IEEE Transactions on Circuits and Systems II: Express Briefs","volume":"72 6","pages":"838-842"},"PeriodicalIF":4.0,"publicationDate":"2025-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144170898","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Dual-Core Dual-Mode DCO for S/C-Band Radar","authors":"Cao Wan;Xiongyao Luo;Shuai Deng;Boyi Dong;Zhongzhiguang Lu;Yan’ge Wang;Shiquan Wang;Quan Xue;Yuanjin Zheng","doi":"10.1109/TCSII.2025.3559922","DOIUrl":"https://doi.org/10.1109/TCSII.2025.3559922","url":null,"abstract":"This brief presents a dual-core, dual-mode digital-controlled oscillator (DCO) featuring a novel transformer-based resonator that minimizes parasitics in the switch connection lines and enhances the design flexibility of the resonator. Fabricated using 65-nm CMOS technology, the DCO occupies a die area of 0.76 mm². Powered by a 1.2-V supply, it consumes between 6.6 and 10.8 mW of DC power. The DCO, utilizing a 127-bit thermometer-code switched varactor array (SVA), achieves a total bandwidth of 63.89%, spanning from 2.52 to 4.88 GHz. Over the entire frequency range, phase noise is measured between −116.14 and −119.34 dBc/Hz, resulting in a FoM ranging from −178.3 to −182.0 dBc/Hz, and a <inline-formula> <tex-math>$\\mathbf{FoM}_{\\mathbf{T}}$ </tex-math></inline-formula> varying from −194.4 to −198.1 dBc/Hz. This DCO is well-suited as the frequency source for S-band and C-band radar applications.","PeriodicalId":13101,"journal":{"name":"IEEE Transactions on Circuits and Systems II: Express Briefs","volume":"72 6","pages":"808-812"},"PeriodicalIF":4.0,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144170778","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Preassigned-Time Sliding-Mode Control of Chaotic Memristive Neural Networks With Time-Varying Delays","authors":"Guoqing Gao;Hailong Ge;Gaohua Wang;Leimin Wang","doi":"10.1109/TCSII.2025.3559558","DOIUrl":"https://doi.org/10.1109/TCSII.2025.3559558","url":null,"abstract":"Preassigned-time (PASST) control of memristive neural networks has recently been an active research topic. Unlike finite-time control, whose stable time depends on the initial condition of the system, the PASST control studied in this brief has a stable time that is independent of the initial condition and can be set in advance. For a class of chaotic memristive neural networks with time delays, a sliding-mode based approach is designed to realize PASST stability. Unlike finite-time stability, the upper bound of the stable time is neither related to nor constrained by the initial condition, and it can be defined arbitrarily to meet practical requirements. Moreover, as special cases, exponential stability and fixed-time stability are also obtained within the same sliding-mode based framework. Finally, a chaotic numerical example with several comparative cases is given to verify the validity of the PASST control method.","PeriodicalId":13101,"journal":{"name":"IEEE Transactions on Circuits and Systems II: Express Briefs","volume":"72 6","pages":"823-827"},"PeriodicalIF":4.0,"publicationDate":"2025-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144170942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Virtual-Synchronizer-Based Current Sharing Scheme in m-Phase Resonant DC-DC Converters System","authors":"Kangli Liu;Tianao Xiao;Peng Chen;Wenzhe Chen;Jianfeng Zhao","doi":"10.1109/TCSII.2025.3559142","DOIUrl":"https://doi.org/10.1109/TCSII.2025.3559142","url":null,"abstract":"The m-phase LLC resonant converter serves as an effective solution for reducing current stress and enabling high-power DC-DC conversion applications. However, variations in component parameters within the resonant tanks can lead to discrepancies in voltage gain, causing imbalanced output currents and thereby compromising the safe and reliable operation of the system. This brief delves into the underlying mechanism of current imbalance and elucidates the dynamic process through state-plane trajectory analysis. Consequently, a virtual-synchronizer based online current sharing scheme is proposed for m-phase resonant converters, which facilitates rapid online current balancing and ensures excellent synchronization performance. Moreover, it eliminates the need to designate a master phase, thereby enhancing the control flexibility, and the addition or removal of any phase does not disrupt the control process. The proposed method achieves synchronization and current sharing by constructing a virtual phase, without requiring additional hardware such as circuits and sensors. Results validate the effectiveness of the proposed method.","PeriodicalId":13101,"journal":{"name":"IEEE Transactions on Circuits and Systems II: Express Briefs","volume":"72 6","pages":"853-857"},"PeriodicalIF":4.0,"publicationDate":"2025-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144170817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Offset-Tolerant Body-Biased Sense Amplifier With Rise-Time Control Technique for SRAM","authors":"Jaehwan Kim;Mingu Han;Bayartulga Ishdorj;Taehui Na","doi":"10.1109/TCSII.2025.3558562","DOIUrl":"https://doi.org/10.1109/TCSII.2025.3558562","url":null,"abstract":"In this brief, we propose an Offset-Tolerant Body-biased sense amplifier (OTB-SA) with a rise-time <inline-formula> <tex-math>$(T_{\\mathrm{RISE}})$ </tex-math></inline-formula> control technique to address the sensing failure issue that occurs when the input voltage difference <inline-formula> <tex-math>$(\\Delta V_{\\mathrm{BL}})$ </tex-math></inline-formula> of a latch-type SA is smaller than the offset voltage <inline-formula> <tex-math>$(V_{\\mathrm{OS}})$ </tex-math></inline-formula>. The OTB-SA with <inline-formula> <tex-math>$T_{\\mathrm{RISE}}$ </tex-math></inline-formula> leverages body biasing and <inline-formula> <tex-math>$T_{\\mathrm{RISE}}$ </tex-math></inline-formula> control to enhance the differential signal injection (DSI) effect, thereby reducing both <inline-formula> <tex-math>$V_{\\mathrm{OS}}$ </tex-math></inline-formula> and energy consumption. Post-layout HSPICE simulation results using a 28 nm technology model indicate that, when target <inline-formula> <tex-math>$V_{\\mathrm{OS}}$ </tex-math></inline-formula> standard deviation <inline-formula> <tex-math>$(\\sigma_{\\mathrm{OS}})$ </tex-math></inline-formula> is 5 mV, the OTB-SA with <inline-formula> <tex-math>$T_{\\mathrm{RISE}}$ </tex-math></inline-formula> achieves a 49.6% reduction in area and a 60.1% decrease in energy consumption compared to a voltage-latched SA (VLSA) without <inline-formula> <tex-math>$T_{\\mathrm{RISE}}$ </tex-math></inline-formula>. Moreover, compared to previous SAs, the OTB-SA with <inline-formula> <tex-math>$T_{\\mathrm{RISE}}$ </tex-math></inline-formula> showed up to 69.1% area reduction and up to 91.2% energy consumption reduction. Measurements from a 28 nm test chip confirmed that <inline-formula> <tex-math>$T_{\\mathrm{RISE}}$ </tex-math></inline-formula> control is effective, showing a trend where <inline-formula> <tex-math>$\\sigma_{\\mathrm{OS}}$ </tex-math></inline-formula> decreases as <inline-formula> <tex-math>$T_{\\mathrm{RISE}}$ </tex-math></inline-formula> increases for OTB-SA.","PeriodicalId":13101,"journal":{"name":"IEEE Transactions on Circuits and Systems II: Express Briefs","volume":"72 5","pages":"773-777"},"PeriodicalIF":4.0,"publicationDate":"2025-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143896470","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 4×106 Gb/s Mixed-Signal PAM-4 Transceivers for Optical Direct-Detect Applications With Adaptive Linearity Compensation in 28-nm CMOS","authors":"Boyang Zhang;Tianchen Ye;Zhifei Wang;Xin Liu;Tianyuan Zhong;Ruixu Wang;Weixin Gai","doi":"10.1109/TCSII.2025.3557793","DOIUrl":"https://doi.org/10.1109/TCSII.2025.3557793","url":null,"abstract":"Optical transmission has been widely employed in data centers, but complex impairments, including the non-linearity induced by the laser modulator, degrade the signal. Conventional optical modules use DSP-based transceivers to address these impairments, but they rely on advanced technology and consume considerable power and area. In this brief, 4×106 Gb/s mixed-signal PAM-4 transceivers fabricated in 28-nm CMOS are proposed to reduce cost, area, and power consumption. The transceiver supports adaptive linearity compensation with an analog PAM-4 level pre-distortion technique in TX. Thanks to the mixed-signal structure, a 4-tap FFE and a 7-tap DFE including 4 floating taps are implemented in RX, exploiting the DFE’s advantage of not amplifying noise. The transceiver achieves an optical sensitivity of -8.7 dBm, which is 0.7 dBm better than the DSP-based equalization methods under the same optical test environment. The energy efficiency and single-channel area are 4.42 pJ/bit and 0.28 mm², respectively, both of which are better than reported 100 Gb/s counterparts.","PeriodicalId":13101,"journal":{"name":"IEEE Transactions on Circuits and Systems II: Express Briefs","volume":"72 5","pages":"728-732"},"PeriodicalIF":4.0,"publicationDate":"2025-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143900575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}