{"title":"CINE: A 4K-UHD Energy-Efficient Computational Imaging Neural Engine With Overlapped Stripe Inference and Structure-Sparse Kernel","authors":"Kai-Ping Lin;Yu-Chun Ding;Chun-Yeh Lin;Yong-Tai Chen;Chao-Tsung Huang","doi":"10.1109/LSSC.2023.3343913","DOIUrl":"https://doi.org/10.1109/LSSC.2023.3343913","url":null,"abstract":"Recently, convolutional neural networks have achieved great success in high-resolution computational imaging (CI) applications, such as super-resolution, image denoising, and image style transfer. However, it demands an enormous number of external memory accesses, i.e., DRAM bandwidth, and intensive computation while inferencing deeper models for high-quality images. In this letter, an energy-efficient CI neural engine, CINE, is proposed with three key features: 1) overlapped stripe inference flow; 2) structure-sparse convolution kernel; and 3) weight-rotated allocation unit. As a result, CINE can provide 4.6-8.3 TOP/W of energy efficiency for high-quality CI applications.","PeriodicalId":13032,"journal":{"name":"IEEE Solid-State Circuits Letters","volume":"7 ","pages":"26-29"},"PeriodicalIF":2.7,"publicationDate":"2023-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139434682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Editorial Welcome to the New Editor-in-Chief","authors":"Tony Chan Carusone;Pui-In Mak","doi":"10.1109/LSSC.2023.3331812","DOIUrl":"https://doi.org/10.1109/LSSC.2023.3331812","url":null,"abstract":"My three-year term as the Editor-in-Chief of the IEEE Solid-State Circuits Letters will come to an end on 1 January 2024 and I will step aside. It has been a very rewarding experience to oversee the development of the Letters. The progress we have made has been a collaborative effort, supported by dedicated IEEE staff and over 100 past and present Associate Editors and members of the Editorial Review Board. I am pleased to announce that Prof. Elvis (Pui-In) Mak will take over as the next Editor-in-Chief. With his broad research expertise and leadership qualities, I am confident that the IEEE Solid-State Circuits Letters will maintain its upward trajectory.","PeriodicalId":13032,"journal":{"name":"IEEE Solid-State Circuits Letters","volume":"7 ","pages":"1-1"},"PeriodicalIF":2.7,"publicationDate":"2023-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10359147","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138633859","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 1.5-GHz Fully Integrated DC–DC Converter Based on Electromagnetically Coupled Class-D LC Oscillators and Resonant LC Flying Impedance Achieving 4.1-W/mm2 Peak Power Density and 77% Peak Efficiency","authors":"Alessandro Novello;Gabriele Atzeni;Tim Keller;Taekwang Jang","doi":"10.1109/LSSC.2023.3341049","DOIUrl":"https://doi.org/10.1109/LSSC.2023.3341049","url":null,"abstract":"This letter introduces a fully integrated DC–DC converter based on electromagnetically coupled class-D LC oscillators (EMLC) manufactured in a 22nm FDSOI CMOS process. The proposed converter implements a resonant LC flying impedance that improves the EMLC output resistance by accomplishing a resonant charge transfer between the flying capacitor CFLY and the load capacitor CO. This design achieves 77% peak efficiency and 4.1 W/mm2 peak power density in a total area of 0.33 mm2. The output voltage is regulated with a duty cycling scheme from 0.003 W/mm2 up to 2.1 W/mm2 with < 2% efficiency loss.","PeriodicalId":13032,"journal":{"name":"IEEE Solid-State Circuits Letters","volume":"7 ","pages":"38-41"},"PeriodicalIF":2.7,"publicationDate":"2023-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139488219","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A BiCMOS Active Quencher Using an Inverter-Based Differential Amplifier in the Comparator","authors":"B. Goll;M. Hofbauer;H. Zimmermann","doi":"10.1109/LSSC.2023.3338660","DOIUrl":"https://doi.org/10.1109/LSSC.2023.3338660","url":null,"abstract":"For fast switching off of a firing single-photon avalanche diode (SPAD), an active quenching circuit in 0.35-μm BiCMOS technology with a very fast quenching slew rate is introduced. Quenching transients measured at an integrated small prober pad are shown. An NPN transistor as quenching switch leads to an active quenching time of 250 ps and a quenching slew rate of 21.1 V/ns. A self-biased two-inverter differential amplifier used in the comparator makes this fast quenching possible. By the implementation of cascoding, the excess bias voltage of the integrated SPAD can be doubled to 6.6 V with respect to the nominal supply voltage of 3.3 V of the BiCMOS process used. Active resetting of the SPAD is achieved in 725 ps. The power consumption of the BiCMOS quenching circuit is 16.3 mW at 40 Mcounts/s and 3 mW in the idle state.","PeriodicalId":13032,"journal":{"name":"IEEE Solid-State Circuits Letters","volume":"7 ","pages":"18-21"},"PeriodicalIF":2.7,"publicationDate":"2023-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10339666","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139050574","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 40-nm Compute-in-Memory Macro With RRAM Addressing IR Drop and Off-State Current","authors":"Samuel D. Spetalnick;Muya Chang;Shota Konno;Brian Crafton;Ashwin Sanjay Lele;Win-San Khwa;Yu-Der Chih;Meng-Fan Chang;Arijit Raychowdhury","doi":"10.1109/LSSC.2023.3338212","DOIUrl":"https://doi.org/10.1109/LSSC.2023.3338212","url":null,"abstract":"This letter describes an analog current-summing compute-in-memory macro using resistive random-access memory (RRAM). The readout transimpedance amplifiers use offset canceling with differential inputs from added sensing paths for the bitline (BL) and sourceline (SL) to minimize channel-to-channel (ch./ch.) gain error while mitigating IR drop in the BL, SL, and multiplexors (MUXes). The analog-to-digital converters (ADCs) use dynamic offset cancelation to remove ch./ch. ADC intrinsic offset and error due to RRAM off-state current. The 64Kb macro implemented with foundry RRAM in 40-nm CMOS has an area of 0.0263 mm2, ch./ch. gain std. dev. of 1.9%, IR drop per-wordline of 0.004%, and 1.1 V efficiency of 7.8–58.8 TOPS/W.","PeriodicalId":13032,"journal":{"name":"IEEE Solid-State Circuits Letters","volume":"7 ","pages":"10-13"},"PeriodicalIF":2.7,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138822143","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 6.7–3.6-pJ/b 0.63–7.5-Gb/s Rapid On/Off Clock and Data Recovery With <55-ns Turn-On Time","authors":"Jaya Deepthi Bandarupalli;Saurabh Saxena","doi":"10.1109/LSSC.2023.3337045","DOIUrl":"https://doi.org/10.1109/LSSC.2023.3337045","url":null,"abstract":"In this letter, we present a rapid on/off 0.63–7.5-Gb/s digital clock and data recovery with a low-turn-on time and recovered clock jitter. The clock and data recovery (CDR) employs a fast-on 1.875–3.75-GHz digitally controlled oscillator followed by a 2× integer-N PLL. The DCO incorporates an 8-bit digitally controlled phase interpolator embedded in a 6×–12× injection-locked clock multiplier for fast turn-on and low-output jitter. DCO’s output is filtered using the fast-on PLL while generating the sampling clock phases for the half-rate CDR. Fabricated in the TSMC 65-nm process, the CDR recovers the clock with <1.3-ps RMS jitter while dissipating 26.6 mW at 7.5 Gb/s and 14.4 mW at 3.75 Gb/s. Duty cycling the CDR operation lowers the average data rates to 0.63 Gb/s with less than 55-ns turn-on time and 1.6-μs on/off period.","PeriodicalId":13032,"journal":{"name":"IEEE Solid-State Circuits Letters","volume":"7 ","pages":"14-17"},"PeriodicalIF":2.7,"publicationDate":"2023-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138822261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 4.24-GHz 128×256 SRAM Operating Double Pump Read Write Same Cycle in 5-nm Technology","authors":"Nick Zhang;Young Suk Kim;Peter Hsu;Samsoo Kim;Derek Tao;Hung-Jen Liao;P. W. Wang;Geoffrey Yeap;Quincy Li;Tsung-Yung Jonathan Chang","doi":"10.1109/LSSC.2023.3336773","DOIUrl":"https://doi.org/10.1109/LSSC.2023.3336773","url":null,"abstract":"A High-Speed High-Density 1R1W two port 32Kbit (128×256) SRAM with single port 6T bitcell macro is proposed. A read-then-write (RTW) double pump CLK generation circuit with tracking bitline (TRKBL) bypassing is proposed to boost read and write performance. A local interlock circuit (LIC) is introduced in Sense-Amp to reduce active power and push Fmax further. To mitigate metal RC degradation, double metal scheme is applied to improve signal integrity and enhance overall operating cycle time. The silicon results show that the slow corner wafer was able to achieve 4.24 GHz at 1.0 V/100 °C in 5-nm FinFET technology.","PeriodicalId":13032,"journal":{"name":"IEEE Solid-State Circuits Letters","volume":"7 ","pages":"6-9"},"PeriodicalIF":2.7,"publicationDate":"2023-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138633812","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Low-Power 6–9-GHz IEEE 802.15.4a/4z Compliant IR-UWB Transceiver With Pulse Pre-Emphasis Achieving High ToA Precision","authors":"Minyoung Song;Erwin Allebes;Chris Marshall;Anoop Narayan Bhat;Elbert Bechthum;Johan Dijkhuis;Stefano Traferro;Evgenii Tiurin;Peter Vis;Johan van den Heuvel;Mohieddine El Soussi;Pepijn Boer;Alireza Sheikh;Bernard Meyer;Jiang Liu;Stan van der Ven;Nick Winkel;Martijn Hijdra;Gururaja Kasanadi Ramachandra;Yunus Baykal;Huib Visser;Amirashkan Farsaei;Peng Zhang;Arjan Breeschoten;Yao-Hong Liu;Christian Bachmann","doi":"10.1109/LSSC.2023.3335596","DOIUrl":"https://doi.org/10.1109/LSSC.2023.3335596","url":null,"abstract":"This letter presents an IEEE 802.15.4a/4z compliant IR-UWB transceiver for high-precision ranging. By virtue of the proposed digital deserialization–serialization, the TX can generate the intersymbol-interference (ISI)-free IEEE 802.15.4a/4z packet. The proposed analog finite impulse response (FIR)-based TX pre-emphasis improves time-of-arrival (ToA) measurement precision by 3.5× without substantial power overhead and fulfills the spectrum requirement of the standard and the worldwide UWB regulations. The presented transceiver consumes 8.7 mW in TX mode and 21 mW in RX mode.","PeriodicalId":13032,"journal":{"name":"IEEE Solid-State Circuits Letters","volume":"6 ","pages":"297-300"},"PeriodicalIF":2.7,"publicationDate":"2023-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138577885","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 183.4-nJ/Inference 152.8-μW 35-Voice Commands Recognition Wired-Logic Processor Using Algorithm-Circuit Co-Optimization Technique","authors":"Rei Sumikawa;Atsutake Kosuge;Yao-Chung Hsu;Kota Shiba;Mototsugu Hamada;Tadahiro Kuroda","doi":"10.1109/LSSC.2023.3334625","DOIUrl":"https://doi.org/10.1109/LSSC.2023.3334625","url":null,"abstract":"A 183.4-nJ/inference single-chip wired-logic DNN processor that is capable of recognizing all 35 commands defined in the industrial standard voice recognition data set (Google speech command dataset) is developed. The algorithm-circuit co-optimized processor recognizes 3.5 times more commands with 1.6 times better-energy efficiency than the state-of-the-art analog processor while keeping design cost low. By implementing all the processing circuits and wiring required for the 16-layer DNN onto a single chip (7.63 mm² in 40 nm), the need to store weight coefficients and intermediate data in DRAM/SRAM is eliminated. Owing to the proposed architecture, a low-power consumption of 152.8 μW is achieved, which is low enough for always-on applications on battery-powered IoT devices.","PeriodicalId":13032,"journal":{"name":"IEEE Solid-State Circuits Letters","volume":"7 ","pages":"22-25"},"PeriodicalIF":2.7,"publicationDate":"2023-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139419472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SiGe BiCMOS D-Band Heterodyne Power Mixer With Back-Off Efficiency Enhanced by Current Clamping","authors":"Andrea Bilato;Ibrahim Petricli;Andrea Mazzanti","doi":"10.1109/LSSC.2023.3332766","DOIUrl":"10.1109/LSSC.2023.3332766","url":null,"abstract":"A D-band power upconverter in a 55-nm SiGe BiCMOS is presented. The low-output resistance of a switching quad is identified as a limiting factor to mixer power generation in D-band, and common-base transistors are stacked for output power enhancement. Moreover, the current clamping mechanism is exploited to scale the average supply current with output power, improving the efficiency in back-off. Experimental results demonstrate $P_{\\mathrm{sat}} = 6.3$ dBm and $oP_{\\mathrm{1dB}} = 4.5$ dBm at 140 GHz, with efficiency of 3.05% and 2.47%, respectively. The power consumption, from a 2-V supply, rises from 70 mW at the quiescent point to 140 mW at $P_{\\mathrm{sat}}$. The measured output power and efficiency compare favorably against previous works.","PeriodicalId":13032,"journal":{"name":"IEEE Solid-State Circuits Letters","volume":"7 ","pages":"2-5"},"PeriodicalIF":2.7,"publicationDate":"2023-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135759394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}