{"title":"DATIC: A Data-Aware Time-Domain Computing-in-Memory-Based CNN Processor With Dynamic Channel Skipping and Mapping","authors":"Jianxun Yang;Yuyao Kong;Yixuan Li;Chenfu Guo;Hao Sun;Leibo Liu;Shaojun Wei;Jun Yang;Shouyi Yin","doi":"10.1109/OJSSCS.2022.3216562","DOIUrl":"https://doi.org/10.1109/OJSSCS.2022.3216562","url":null,"abstract":"Due to the low-power priority of analog delay-based computation, time-domain computing-in-memory (TD-CIM) presents a splendid potential for energy-constrained edge and IoT scenarios deploying convolutional neural networks (CNNs). However, the latency in delay-based computation is proportional to the numbers and values of multiplications-and-accumulations (MACs), bottlenecking the throughput of previous data-agnostic TD-CIM-based processors which compute complete convolutions in a fixed MAC mapping manner. First, some output activations in each layer of CNNs contribute less to the final classification results, which are insignificant and can be substituted by sums of partial MACs, with a marginal accuracy degradation. Thus, complete convolution computations lead to redundant MACs. Second, activations and weights vary with input images and models. Fixed MAC mapping leads to unbalanced MAC values on delay chains, causing long idle time and latency. To address that, we design a data-aware TD-CIM-based CNN processor, DATIC, with three techniques to reduce latency: 1) a channel-skipping TD-CIM macro to remove redundant MACs for insignificant output activations (IOAs), by storing activations stationary in SRAM bitcells and shifting weights to perform only imperative MACs; 2) a convolution-order programming unit to reduce overhead of skipping redundant MACs for IOAs with random positions on feature maps; and 3) an activation-weight-adaptive channel-mapping scheduler to balance the latency of delay chains by dynamically altering the convolution mapping manner. 
Implemented under TSMC 28-nm technology, DATIC achieves 622.9-GOPS throughput and 32.7-TOPS/W energy efficiency for ResNet-18 with 2-b weights and 8-b activations.","PeriodicalId":100633,"journal":{"name":"IEEE Open Journal of the Solid-State Circuits Society","volume":"2 ","pages":"244-258"},"PeriodicalIF":0.0,"publicationDate":"2022-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8782712/9733783/09927338.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67868225","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Low-Power RF Wake-Up Receivers: Analysis, Tradeoffs, and Design","authors":"Patrick P. Mercier;Benton H. Calhoun;Po-Han Peter Wang;Anjana Dissanayake;Linsheng Zhang;Drew A. Hall;Steven M. Bowers","doi":"10.1109/OJSSCS.2022.3215099","DOIUrl":"https://doi.org/10.1109/OJSSCS.2022.3215099","url":null,"abstract":"Wake-up receivers (WuRXs) offer a potentially energy-efficient means to enable asynchronous wake-up of higher power and higher performance radios without needing frequent (often energy-expensive) synchronization. Since WuRXs are typically on for a large percentage of the time, keeping their power consumption very low is critical to minimizing the total energy draw. However, this is difficult while maintaining good sensitivity, interference resiliency, and robustness, all with application-appropriate wake-up latencies and form factors. This article reviews the main challenges facing WuRXs, outlines the most popular WuRX architectures, and details essential design techniques and tradeoffs toward enabling utility in emerging applications.","PeriodicalId":100633,"journal":{"name":"IEEE Open Journal of the Solid-State Circuits Society","volume":"2 ","pages":"144-164"},"PeriodicalIF":0.0,"publicationDate":"2022-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8782712/9733783/09923621.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67868139","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An All LTPS-TFT-Based Charge-Integrating Amplifier for Sensor-Array Readout Circuit on Flexible Substrate","authors":"Mohit Dandekar;Kris Myny;Wim Dehaene","doi":"10.1109/OJSSCS.2022.3213772","DOIUrl":"https://doi.org/10.1109/OJSSCS.2022.3213772","url":null,"abstract":"This article presents the design of a readout circuit for charge-output sensor arrays integrated on a flexible substrate. The charge-integrating amplifier is built with a current-output transimpedance amplifier that includes the integrator function with reset. The charge-integrating amplifier has a fully differential internal topology, improving over single-ended design, including the feedback amplifier implemented specifically as a Nauta-transconductor. The readout circuit has been manufactured in a 3-μm low-temperature polysilicon process on foil and measured, achieving a bandwidth of 200 kHz, operation at a 5-V supply while consuming 586-μW power and maintaining a maximum integral nonlinearity of 5%.","PeriodicalId":100633,"journal":{"name":"IEEE Open Journal of the Solid-State Circuits Society","volume":"2 ","pages":"208-216"},"PeriodicalIF":0.0,"publicationDate":"2022-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8782712/9733783/09919191.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67868136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sparsity-Aware 25-Gb/s Memory Link With 0.0375-pJ/bit Signaling Efficiency for Machine Learning Hardware","authors":"Shovon Dey;Can Ni;Alberto Leon Cevallos;Raju Machupalli;Mrinal Mandal;Masum Hossain","doi":"10.1109/OJSSCS.2022.3213633","DOIUrl":"https://doi.org/10.1109/OJSSCS.2022.3213633","url":null,"abstract":"This work describes a multiplication and accumulation (MAC) accelerator integrated with a memory interface. The link is designed to take advantage of naturally existing sparsity in a neural network. The link operating at 16 Gb/s achieves 0.1875-pJ/bit signaling efficiency for random data but, for sparse data, signaling efficiency can improve to 0.0375 pJ/bit. Similarly, the MAC unit accelerates the computation utilizing the phase domain accumulation process and provides a 40% improvement in energy efficiency for sparse data and at the same time achieves inference accuracy of 94% for the MNIST data set.","PeriodicalId":100633,"journal":{"name":"IEEE Open Journal of the Solid-State Circuits Society","volume":"2 ","pages":"276-287"},"PeriodicalIF":0.0,"publicationDate":"2022-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8782712/9733783/09916077.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50327147","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cryogenic Controller for Electrostatically Controlled Quantum Dots in 22-nm Quantum SoC","authors":"Robert Bogdan Staszewski;Ali Esmailiyan;Hongying Wang;Eugene Koskin;Panagiotis Giounanlis;Xutong Wu;Anna Koziol;Andrii Sokolov;Imran Bashir;Mike Asker;Dirk Leipold;Reza Nikandish;Teerachot Siriburanon;Elena Blokhina","doi":"10.1109/OJSSCS.2022.3213528","DOIUrl":"https://doi.org/10.1109/OJSSCS.2022.3213528","url":null,"abstract":"We present a fully integrated cryogenic controller for electrostatically controlled quantum dots (QDs) implemented in a commercial 22-nm fully depleted silicon-on-insulator CMOS process and operating in a quantum regime. The QDs are realized in local well areas of transistors separated by tunnel barriers controlled by voltages applied to gate terminals. The QD arrays (QDA) are co-located with the control circuitry inside each quantum experiment cell, with a total of 28 of such cells comprising this system-on-chip (SoC). The QDA structure is controlled by small capacitive digital-to-analog converters (CDACs) and its quantum state is measured by a single-electron detector. The SoC operates at a cryogenic temperature of 3.4 K. The occupied area of each QDA is 0.7 × 0.4 μm², while each QD occupies only 20 × 80 nm². The low power and miniaturized area of these circuits are an important step on the way for integration of a large quantum core with millions of QDs, required for practical quantum computers. The performance and functionality of the CDAC are validated in a loop-back mode with the detector sensing the CDAC-compelled electron tunneling from the quantum point contact (QPC) node into the quantum structure. The position of the injected charge inside the QDA is intended to be controlled through the CDAC codes and programmable pulse width. 
Quantum effects are shown by an experimental characterization of charge injection and quantization into the QDA consisting of three coupled QDs. The charge can be transferred to a QD and sensed at the QPC, and this process is controlled by the relevant voltages and CDACs.","PeriodicalId":100633,"journal":{"name":"IEEE Open Journal of the Solid-State Circuits Society","volume":"2 ","pages":"103-121"},"PeriodicalIF":0.0,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8782712/9733783/09915422.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67868122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 3.7-mW 12.5-MHz 81-dB SNDR 4th-Order Continuous-Time DSM With Single-OTA and 2nd-Order Noise-Shaping SAR","authors":"Wei Shi;Jiaxin Liu;Abhishek Mukherjee;Xiangxing Yang;Xiyuan Tang;Linxiao Shen;Wenda Zhao;Nan Sun","doi":"10.1109/OJSSCS.2022.3212333","DOIUrl":"https://doi.org/10.1109/OJSSCS.2022.3212333","url":null,"abstract":"This article presents a hybrid 4th-order delta–sigma modulator (DSM). It combines a continuous-time (CT) loop filter and a discrete-time (DT) passive 2nd-order noise-shaping SAR (NS-SAR). Since the 2nd-order NS-SAR is robust against PVT variation, the stability of this 4th-order DSM is similar to that of a 2nd-order CT-DSM. The CT loop filter is based on single-amplifier–biquad (SAB) structure. As a result, only one OTA is used to achieve 4th-order noise shaping, leading to a high power efficiency. Moreover, this work implements both excess-loop delay (ELD) compensation and an input feedforward path inside the NS-SAR in the charge domain, further reducing the circuit complexity and the OTA power. Overall, this work achieves 81-dB SNDR over 12.5 MHz with 3.7-mW power, leading to a Schreier FoM of 176 dB.","PeriodicalId":100633,"journal":{"name":"IEEE Open Journal of the Solid-State Circuits Society","volume":"2 ","pages":"122-134"},"PeriodicalIF":0.0,"publicationDate":"2022-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8782712/9733783/09913224.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67868137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Introduction to High Sample Rate Nyquist Analog-to-Digital Converters","authors":"Gabriele Manganaro","doi":"10.1109/OJSSCS.2022.3212028","DOIUrl":"https://doi.org/10.1109/OJSSCS.2022.3212028","url":null,"abstract":"Increasingly wider band analog signals found in multiple information and communication technology applications, requiring real-time digital signal processing, demand analog-to-digital converters with ever higher sample rate. Several innovative techniques, from the circuit level, to architecture and algorithms, have enabled remarkable breakthroughs in a relatively short span of time. This overview article aims to introduce this topic and to point to some of the most notable results, while also highlighting open problems and engineering trends.","PeriodicalId":100633,"journal":{"name":"IEEE Open Journal of the Solid-State Circuits Society","volume":"2 ","pages":"82-102"},"PeriodicalIF":0.0,"publicationDate":"2022-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8782712/9733783/09911689.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67868121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Low-Power SAR ADCs: Basic Techniques and Trends","authors":"Pieter Harpe","doi":"10.1109/OJSSCS.2022.3211482","DOIUrl":"https://doi.org/10.1109/OJSSCS.2022.3211482","url":null,"abstract":"With the advent of small, battery-powered devices, power efficiency has become of paramount importance. For analog-to-digital converters (ADCs), the successive approximation register (SAR) architecture plays a prominent role thanks to its ability to combine power efficiency with a simple architecture, a broad application scope, and technology portability. In this review article, the basic design challenges for low-power SAR ADCs are summarized and several design techniques are illustrated. Furthermore, the limitations of SAR ADCs are outlined and hybrid architecture trends, such as noise-shaping SAR ADCs and pipelined SAR ADCs, are briefly introduced and clarified with examples.","PeriodicalId":100633,"journal":{"name":"IEEE Open Journal of the Solid-State Circuits Society","volume":"2 ","pages":"73-81"},"PeriodicalIF":0.0,"publicationDate":"2022-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8782712/9733783/09908164.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67868120","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DARKSIDE: A Heterogeneous RISC-V Compute Cluster for Extreme-Edge On-Chip DNN Inference and Training","authors":"Angelo Garofalo;Yvan Tortorella;Matteo Perotti;Luca Valente;Alessandro Nadalini;Luca Benini;Davide Rossi;Francesco Conti","doi":"10.1109/OJSSCS.2022.3210082","DOIUrl":"https://doi.org/10.1109/OJSSCS.2022.3210082","url":null,"abstract":"On-chip deep neural network (DNN) inference and training at the Extreme-Edge (TinyML) impose strict latency, throughput, accuracy, and flexibility requirements. Heterogeneous clusters are promising solutions to meet the challenge, combining the flexibility of DSP-enhanced cores with the performance and energy boost of dedicated accelerators. We present DARKSIDE, a System-on-Chip with a heterogeneous cluster of eight RISC-V cores enhanced with 2-b to 32-b mixed-precision integer arithmetic. To boost the performance and efficiency on key compute-intensive DNN kernels, the cluster is enriched with three digital accelerators: 1) a specialized engine for low-data-reuse depthwise convolution kernels (up to 30 MAC/cycle); 2) a minimal overhead datamover to marshal 1–32-b data on-the-fly; and 3) a 16-b floating-point tensor product engine (TPE) for tiled matrix-multiplication acceleration. DARKSIDE is implemented in 65-nm CMOS technology. The cluster achieves a peak integer performance of 65 GOPS and a peak efficiency of 835 GOPS/W when working on 2-b integer DNN kernels. 
When targeting floating-point tensor operations, the TPE provides up to 18.2 GFLOPS of performance or 300 GFLOPS/W of efficiency—enough to enable on-chip floating-point training at competitive speed coupled with ultralow power quantized inference.","PeriodicalId":100633,"journal":{"name":"IEEE Open Journal of the Solid-State Circuits Society","volume":"2 ","pages":"231-243"},"PeriodicalIF":0.0,"publicationDate":"2022-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8782712/9733783/09903915.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67868123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Auto-Reconfigurable Multi-Output Regulating Switched-Capacitor DC–DC Converter for Wireless Power Reception and Distribution in Multi-Unit Implantable Devices","authors":"Unbong Lee;Wanyeong Jung;Sohmyung Ha;Minkyu Je","doi":"10.1109/OJSSCS.2022.3202145","DOIUrl":"https://doi.org/10.1109/OJSSCS.2022.3202145","url":null,"abstract":"An automatically reconfigurable switched-capacitor DC-DC converter with multiple regulated outputs is presented for wireless-powered multi-unit implantable medical devices (IMDs). In such devices, the main controller unit is powered wirelessly and provides supply voltages to the circuits of the main unit as well as multiple connected sub-units. The proposed DC-DC converter simultaneously generates two regulated voltages for the main unit and two unregulated voltages for the sub-units, which have on-site low-dropout regulators. The converter consists of i) an input-adaptive DC-DC conversion stage with two switched-capacitor (SC) DC-DC converters in series and ii) a regulating stage. In the DC-DC conversion stage, the proposed converter automatically reconfigures the conversion ratio and connection order of the two SC DC-DC converters and selects the output nodes by load selection switches depending on the input level. Thanks to these adaptive configurations, the proposed converter offers high conversion efficiencies over a wide input voltage range even with fewer flying capacitors required for the reconfigurable conversion ratios. Moreover, the selection switches are reused to regulate the output voltages to desired levels, minimizing the overhead for subsequent regulation. 
The IC fabricated in a 180-nm standard CMOS process achieves a conversion efficiency of 95.5% for the unregulated voltages and up to 77.4% for the regulated voltages over a wide input range of 1 V to 4 V with 0.74-mV output ripple for a load current of 20 mA, while providing four outputs (2 regulated, 2 unregulated).","PeriodicalId":100633,"journal":{"name":"IEEE Open Journal of the Solid-State Circuits Society","volume":"3 ","pages":"65-75"},"PeriodicalIF":0.0,"publicationDate":"2022-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8782712/10019316/09868089.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67861768","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}