{"title":"A 4 × 4 Biosensor Array With a 42-μW/Channel Multiplexed Current Sensitive Front-End Featuring 137-dB DR and Zeptomolar Sensitivity","authors":"Enrico Genco;Marco Fattori;Pieter J. A. Harpe;Francesco Modena;Fabrizio Antonio Viola;Mario Caironi;May Wheeler;Guillaume Fichet;Fabrizio Torricelli;Lucia Sarcina;Eleonora Macchia;Luisa Torsi;Eugenio Cantatore","doi":"10.1109/OJSSCS.2022.3217231","DOIUrl":"https://doi.org/10.1109/OJSSCS.2022.3217231","url":null,"abstract":"This article presents a multiplexed current sensitive readout for label-free zeptomolar-sensitive detectors realized with large-area electrolyte-gated organic thin-film transistors (EGOFETs). These highly capacitive biosensors are multiplexed using an organic thin-film transistor (OTFT) line driver and OTFT switches and interfaced to a 65-nm Si CMOS, low-power, pA-sensitive front-end. The Si chip performs analog-to-digital conversion and data transmission to a microcontroller too. A current domain interface is used to transmit the signals coming from multiple biosensors to the 1.2-V supply CMOS Si-IC via the 30-V supply OTFT electronics. Exploiting an analog module implemented in the Si-IC, the EGOFETs are precisely biased, even in the presence of a large OTFT multiplexer resistance. The CMOS current sensitive front-end achieves a dynamic range (DR) of 137 dB and a power consumption of 42 μW per channel reaching a state-of-the-art DR-power-bandwidth FOM of 208 dB. The front-end has been designed with a first-stage programmable-gain, active-feedback transimpedance amplifier topology that, contrary to common current-sensitive front-end solutions, is not affected by the sensor capacitance. The system has been validated with different concentrations of human IgG and IgM proteins using both a single sensor and a 4 × 4 array of EGOFETs. 
Thanks to the multiplexing strategy and the low costs of its modules, the system here presented has the potential to enable widespread use of precision diagnostic with extreme sensitivity even in point-of-care and low-resource settings.","PeriodicalId":100633,"journal":{"name":"IEEE Open Journal of the Solid-State Circuits Society","volume":"2 ","pages":"193-207"},"PeriodicalIF":0.0,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8782712/9733783/09940322.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67868115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Energy-Efficient DNN Training Processors on Micro-AI Systems","authors":"Donghyeon Han;Sanghoon Kang;Sangyeob Kim;Juhyoung Lee;Hoi-Jun Yoo","doi":"10.1109/OJSSCS.2022.3219034","DOIUrl":"https://doi.org/10.1109/OJSSCS.2022.3219034","url":null,"abstract":"Many edge/mobile devices are now able to utilize deep neural networks (DNNs) thanks to the development of mobile DNN accelerators. Mobile DNN accelerators overcame the problems of limited computing resources and battery capacity by realizing energy-efficient inference. However, its passive behavior makes it difficult for DNN to provide active customization for individual users or its service environment. The importance of on-chip training is rising more and more to provide active interaction between DNN processors and ever-changing surroundings or conditions. Despite its advantages, the DNN training has more constraints than the inference such that it was considered impractical to be realized on mobile/edge devices. Recently, there are many trials to realize mobile DNN training, and a number of prior works will be summarized. First, it arranges the new challenges of the DNN accelerator induced by training functionality and discusses new hardware features related to the challenges. Second, it explains algorithm-hardware co-optimization methods and explains why it becomes mainstream in mobile DNN training research. Third, it compares the main differences between the conventional inference accelerators and recent training processors. 
Finally, the conclusion is made by proposing the future directions of the DNN training processor in micro-AI systems.","PeriodicalId":100633,"journal":{"name":"IEEE Open Journal of the Solid-State Circuits Society","volume":"2 ","pages":"259-275"},"PeriodicalIF":0.0,"publicationDate":"2022-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8782712/9733783/09935273.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50415865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Reconfigurable Power-Efficient Quantized Analog RF Front-End With Smart Calibration","authors":"Justin Yonghui Kim;Antonio Liscidini","doi":"10.1109/OJSSCS.2022.3218494","DOIUrl":"https://doi.org/10.1109/OJSSCS.2022.3218494","url":null,"abstract":"A power-scalable RF front-end using quantized analog signal processing is presented. The front-end is based on a voltage-mode power-scalable approach which allows the power dissipation to be scaled upon the operative scenario and to perform an agile calibration for mismatch impairments. Power and input dynamic range can be scaled upon the desired 1-dB compression point (1dBCP) (from −15.3 to 0.5 dBm) while keeping the same sensitivity with 2.5-dB NF. Signal path power can vary between 3.3 and 6.4 mW while clock generation and distribution power can vary between 1.6 and 18.5 mW/GHz, with a phase noise as low as −171.2 dBc/Hz. After calibration, IM2 and IM3 improved up to 33 dB while 1dBCP improved by 1 dB, which resulted in achieving an IIP3 of 26.1 dBm and IIP2 of 71 dBm at 0-dBm 1dBCP.","PeriodicalId":100633,"journal":{"name":"IEEE Open Journal of the Solid-State Circuits Society","volume":"2 ","pages":"165-174"},"PeriodicalIF":0.0,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8782712/9733783/09933817.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67868140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Device, Circuit, and System Design for Enabling Giga-Hertz Large-Area Electronics","authors":"Yue Ma;Can Wu;Nicholas M. Fata;Prakhar Kumar;Sigurd Wagner;James C. Sturm;Naveen Verma","doi":"10.1109/OJSSCS.2022.3217759","DOIUrl":"https://doi.org/10.1109/OJSSCS.2022.3217759","url":null,"abstract":"Recent progress has substantially increased the operating frequency of large-area electronic (LAE) devices. Their integration into circuits has enabled unprecedented system-level capabilities, toward future wireless applications for the Internet of Things (IoT) and 5G/6G. These exploit large dimensions and flexible form factors. In this work, we focus on giga-Hertz (GHz) zinc-oxide (ZnO) thin-film transistors (TFTs) as a foundational device for enabling GHz LAE circuits and systems. To further understand their operation and limits in the newly possible frequency regime, we incorporate the effects of temperature and of non-quasi-static (NQS) physics into the device models. We then analyze operation including these effects on a fundamental circuit block, the cross-coupled inductor-capacitor (LC) oscillator. It is used in representative LAE systems, namely, a 13.56-MHz radio-frequency identification (RFID) reader array for near-field energy transfer, and a 1-GHz phased array for far-field radiation beam steering. The co-design of devices, circuits, and systems is essential for achieving flexible and meter-scale monolithic-integrated LAE wireless systems. 
For these, understanding temperature limitations and the NQS effect is crucial.","PeriodicalId":100633,"journal":{"name":"IEEE Open Journal of the Solid-State Circuits Society","volume":"2 ","pages":"177-192"},"PeriodicalIF":0.0,"publicationDate":"2022-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8782712/9733783/09933352.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67868134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 3-GS/s RF Track-and-Hold Amplifier Utilizing Body-Biasing With >55-dBFS SNR and >67-dBc SFDR Up to 3 GHz in 22-nm CMOS SOI","authors":"Enne Wittenhagen;Patrick James Artz;Philipp Scholz;Friedel Gerfers","doi":"10.1109/OJSSCS.2022.3217019","DOIUrl":"https://doi.org/10.1109/OJSSCS.2022.3217019","url":null,"abstract":"In this article, a 3-GS/s time-interleaved (TI) RF track-and-hold (TaH) amplifier designed in a 22-nm SOI technology is presented. The TaH amplifier is designed to drive an ADC, which can be either two pipeline-ADCs or two rows of SAR-ADCs. Both TI TaH are driven by a single RF-matched wide-band bulk-controlled front-end (FE) buffer. The measured TaH amplifier has an SFDR beyond 70 dBc up to 2.5 GHz and remains above 67 dBc till 3 GHz enabling subsampling. An overall system bandwidth of 4.5 GHz is achieved with an SNR above 55 dBFS. The ultralow-jitter clock regeneration has only 45 fs rms jitter not limiting the SNR up to 3 GHz. Two-tone and multitone measurements reveal a third intermodulation and interband nonlinearity with >72 and >82 dBFS, respectively. Off-chip calibration of offset/gain mismatch and time-skew between both TaH-lanes reduce interleaving spurs >75 dBFS utilizing a 37-tap fractional delay FIR filter. The efficient body-bias control of the technology is used to dynamically body-bias the TaH sample-switch increasing bandwidth by 10% improving settling performance while at the same time the leakage decreases. Static body-biasing is also applied to the common-mode feedback by using the bulk as a control node. 
The TaH amplifier including the clock generation consumes only 178 mW from a triple 2 V/0.9 V/−0.8 V supply.","PeriodicalId":100633,"journal":{"name":"IEEE Open Journal of the Solid-State Circuits Society","volume":"2 ","pages":"135-143"},"PeriodicalIF":0.0,"publicationDate":"2022-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8782712/9733783/09928330.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67868138","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 64-TOPS Energy-Efficient Tensor Accelerator in 14nm With Reconfigurable Fetch Network and Processing Fusion for Maximal Data Reuse","authors":"Sang Min Lee;Hanjoon Kim;Jeseung Yeon;Juyun Lee;Younggeun Choi;Minho Kim;Changjae Park;Kiseok Jang;Youngsik Kim;Yongseung Kim;Changman Lee;Hyuck Han;Won Eung Kim;Rui Tang;Joon Ho Baek","doi":"10.1109/OJSSCS.2022.3216798","DOIUrl":"https://doi.org/10.1109/OJSSCS.2022.3216798","url":null,"abstract":"For energy-efficient accelerators in data centers that leverage advances in the performance and energy efficiency of recent algorithms, flexible architectures are critical to support state-of-the-art algorithms for various deep learning tasks. Due to the matrix multiplication units at the core of tensor operations, most recent programmable architectures lack flexibility for layers with diminished dimensions, especially for inferences where a large batch axis is rarely allowed. In addition, exploiting the data reuse inherent within tensor operations for computing a single matrix multiplication is challenging. In this work, an extension of a vector processor in 14 nm is proposed, which is customized to tensor operations. The flexible architecture enables a tensorized loop to support various data layouts and different shapes and sizes of tensor operations. It also exploits all possible data reuse, including input, weight, and output. Based on the tensorized loop, fetch and reduction networks, which unicast or multicast with the ordering of both input data and processing data, can be simplified using a circuit-switching-like network with configured topology and flow control for each tensor operation. Two processing elements can be fused to optimize latency for a large model or can operate individually for throughput. 
As a result, various state-of-the-art models can be processed efficiently with straightforward compiler optimization, and the highest energy efficiency of 13.4 Inferences/s/W on EfficientNetV2-S is demonstrated.","PeriodicalId":100633,"journal":{"name":"IEEE Open Journal of the Solid-State Circuits Society","volume":"2 ","pages":"219-230"},"PeriodicalIF":0.0,"publicationDate":"2022-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8782712/9733783/09927346.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67868133","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DATIC: A Data-Aware Time-Domain Computing-in-Memory-Based CNN Processor With Dynamic Channel Skipping and Mapping","authors":"Jianxun Yang;Yuyao Kong;Yixuan Li;Chenfu Guo;Hao Sun;Leibo Liu;Shaojun Wei;Jun Yang;Shouyi Yin","doi":"10.1109/OJSSCS.2022.3216562","DOIUrl":"https://doi.org/10.1109/OJSSCS.2022.3216562","url":null,"abstract":"Due to the low-power priority of analog delay-based computation, time-domain computing-in-memory (TD-CIM) presents a splendid potential for energy-constrained edge and IoT scenarios deploying convolutional neural networks (CNNs). However, the latency in delay-based computation is proportional to the numbers and values of multiplications-and-accumulations (MACs), bottlenecking the throughput of previous data-agnostic TD-CIM-based processors which compute complete convolutions in a fixed MAC mapping manner. First, some output activations in each layer of CNNs contribute less to the final classification results, which are insignificant and can be substituted by sums of partial MACs, with a marginal accuracy degradation. Thus, complete convolution computations lead to redundant MACs. Second, activations and weights vary with input images and models. Fixed MAC mapping leads to unbalanced MAC values on delay chains, causing long idle time and latency. To address that, we design a data-aware TD-CIM-based CNN processor, DATIC, with three techniques to reduce latency: 1) a channel-skipping TD-CIM macro to remove redundant MACs for insignificant output activations (IOAs), by storing activations stationary in SRAM bitcells and shifting weights to perform only imperative MACs; 2) a convolution-order programming unit to reduce overhead of skipping redundant MACs for IOAs with random positions on feature maps; and 3) an activation-weight-adaptive channel-mapping scheduler to balance the latency of delay chains by dynamically altering the convolution mapping manner. 
Implemented under TSMC 28-nm technology, DATIC achieves 622.9-GOPS throughput and 32.7-TOPS/W energy efficiency for ResNet-18 with 2-b weights and 8-b activations.","PeriodicalId":100633,"journal":{"name":"IEEE Open Journal of the Solid-State Circuits Society","volume":"2 ","pages":"244-258"},"PeriodicalIF":0.0,"publicationDate":"2022-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8782712/9733783/09927338.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67868225","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Low-Power RF Wake-Up Receivers: Analysis, Tradeoffs, and Design","authors":"Patrick P. Mercier;Benton H. Calhoun;Po-Han Peter Wang;Anjana Dissanayake;Linsheng Zhang;Drew A. Hall;Steven M. Bowers","doi":"10.1109/OJSSCS.2022.3215099","DOIUrl":"https://doi.org/10.1109/OJSSCS.2022.3215099","url":null,"abstract":"Wake-up receivers (WuRXs) offer a potentially energy-efficient means to enable asynchronous wake-up of higher power and higher performance radios without needing frequent (often energy-expensive) synchronization. Since WuRXs are typically on for a large percentage of the time, keeping their power consumption very low is critical to minimizing the total energy draw. However, this is difficult while maintaining good sensitivity, interference resiliency, and robustness, all with application-appropriate wake-up latencies and form factors. This article reviews the main challenges facing WuRXs, outlines the most popular WuRX architectures, and details essential design techniques and tradeoffs toward enabling utility in emerging applications.","PeriodicalId":100633,"journal":{"name":"IEEE Open Journal of the Solid-State Circuits Society","volume":"2 ","pages":"144-164"},"PeriodicalIF":0.0,"publicationDate":"2022-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8782712/9733783/09923621.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67868139","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An All LTPS-TFT-Based Charge-Integrating Amplifier for Sensor-Array Readout Circuit on Flexible Substrate","authors":"Mohit Dandekar;Kris Myny;Wim Dehaene","doi":"10.1109/OJSSCS.2022.3213772","DOIUrl":"https://doi.org/10.1109/OJSSCS.2022.3213772","url":null,"abstract":"This article presents the design of a readout circuit for charge-output sensor arrays integrated on a flexible substrate. The charge-integrating amplifier is built with a current-output transimpedance amplifier that includes the integrator function with reset. The charge-integrating amplifier has a fully differential internal topology, improving over single-ended design, including the feedback amplifier implemented specifically as a Nauta-transconductor. The readout circuit has been manufactured in a 3-μm low-temperature polysilicon process on foil and measured, achieving a bandwidth of 200 kHz, operation at a 5-V supply while consuming 586-μW power and maintaining a maximum integral nonlinearity of 5%.","PeriodicalId":100633,"journal":{"name":"IEEE Open Journal of the Solid-State Circuits Society","volume":"2 ","pages":"208-216"},"PeriodicalIF":0.0,"publicationDate":"2022-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8782712/9733783/09919191.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"67868136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sparsity-Aware 25-Gb/s Memory Link With 0.0375-pJ/bit Signaling Efficiency for Machine Learning Hardware","authors":"Shovon Dey;Can Ni;Alberto Leon Cevallos;Raju Machupalli;Mrinal Mandal;Masum Hossain","doi":"10.1109/OJSSCS.2022.3213633","DOIUrl":"https://doi.org/10.1109/OJSSCS.2022.3213633","url":null,"abstract":"This work describes a multiplication and accumulation (MAC) accelerator integrated with a memory interface. The link is designed to take advantage of naturally existing sparsity in a neural network. The link operating at 16 Gb/s achieves 0.1875-pJ/bit signaling efficiency for random data but, for sparse data, signaling efficiency can improve to 0.0375 pJ/bit. Similarly, the MAC unit accelerates the computation utilizing the phase domain accumulation process and provides a 40% improvement in energy efficiency for sparse data and at the same time achieves inference accuracy of 94% for the MNIST data set.","PeriodicalId":100633,"journal":{"name":"IEEE Open Journal of the Solid-State Circuits Society","volume":"2 ","pages":"276-287"},"PeriodicalIF":0.0,"publicationDate":"2022-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8782712/9733783/09916077.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50327147","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}