{"title":"Multiplex PCR CMOS Biochip for Detection of Upper Respiratory Pathogens including SARS-CoV-2","authors":"Arun Manickam, Kirsten A. Johnson, Rituraj Singh, Nicholas Wood, Edmond Ku, A. Cuppoletti, M. McDermott, A. Hassibi","doi":"10.23919/VLSICircuits52068.2021.9492353","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492353","url":null,"abstract":"A 1024-pixel CMOS biochip for multiplex polymerase chain reaction application is presented. Biosensing pixels include 137dB DDR photosensors and an integrated emission filter with OD~6 to perform real-time fluorescence-based measurements while thermocycling the reaction chamber with heating and cooling rates of > ±10°C/s. The surface of the CMOS IC is biofunctionalized with DNA capturing probes. The biochip is integrated into a fluidic consumable enabling loading of extracted nucleic acid samples and the detection of upper respiratory pathogens, including SARS-CoV-2.","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115675997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Fully Integrated Switched-Capacitor Voltage Regulator with Multi-Rate Successive Approximation Achieving 190 ps Transient FoM and 83.7% Conversion Efficiency","authors":"Bing-Chen Wu, Tsung-Te Liu","doi":"10.23919/VLSICircuits52068.2021.9492333","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492333","url":null,"abstract":"This paper presents a fully integrated switched-capacitor dc–dc voltage regulator (SCVR) in standard 28 nm CMOS with a proposed regulation algorithm of multi-rate successive approximation (MRSA) and several conversion efficiency enhancement techniques. The proposed SCVR achieves 190 ps transient FoM with peak conversion efficiency of 83.7%@114.2 mA/mm² and 110× supported loading range of 80 μA–8.8 mA.","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125373543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 5.1ms Low-Latency Face Detection Imager with In-Memory Charge-Domain Computing of Machine-Learning Classifiers","authors":"Hyunsoo Song, Sungjin Oh, Juan Salinas, Sung-Yun Park, E. Yoon","doi":"10.23919/VLSICircuits52068.2021.9492432","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492432","url":null,"abstract":"We present a CMOS imager for low-latency face detection empowered by parallel imaging and computing of machine-learning (ML) classifiers. The energy-efficient parallel operation and multi-scale detection eliminate image capture delay and significantly alleviate backend computational loads. The proposed pixel architecture, composed of dynamic samplers in a global shutter (GS) pixel array, allows for energy-efficient in-memory charge-domain computing of feature extraction and classification. The illumination-invariant detection was realized by using log-Haar features. A prototype 240×240 imager achieved an on-chip face detection latency of 5.1ms with a 97.9% true positive rate and 2% false positive rate at 120fps. Moreover, a dynamic nature of in-memory computing allows an energy efficiency of 419pJ/pixel for feature extraction and classification, leading to the smallest latency-energy product of 3.66ms∙nJ/pixel with digital backend processing.","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126954303","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 32A 5V-Input, 94.2% Peak Efficiency High-Frequency Power Converter Module Featuring Package-Integrated Low-Voltage GaN NMOS Power Transistors","authors":"Nachiket V. Desai, H. Krishnamurthy, William J. Lambert, Jingshu Yu, H. Then, N. Butzen, Sheldon Weng, C. Schaef, N. Nidhi, M. Radosavljevic, J. Rode, J. Sandford, K. Radhakrishnan, K. Ravichandran, B. Sell, J. Tschanz, V. De","doi":"10.23919/VLSICircuits52068.2021.9492350","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492350","url":null,"abstract":"A 5V-input, high-frequency, high-density (9A/mm²) buck converter featuring a low-voltage GaN power transistor (with 5-10× better FoM than Si) with on-die gate clamps, integrated with a CMOS companion die in 4mm × 4mm package, achieves 94.2% peak efficiency for 5Vin/1Vout at 3MHz switching frequency with a 40nH inductor.","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122630851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 1.15μW 5.54mm3 Implant with a Bidirectional Neural Sensor and Stimulator SoC utilizing Bi-Phasic Quasi-static Brain Communication achieving 6kbps-10Mbps Uplink with Compressive Sensing and RO-PUF based Collision Avoidance","authors":"Baibhab Chatterjee, K. G. Kumar, Mayukh Nath, Shulan Xiao, Nirmoy Modak, D. Das, Jayant Krishna, Shreyas Sen","doi":"10.23919/VLSICircuits52068.2021.9492445","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492445","url":null,"abstract":"To solve the challenge of powering and communication in a brain implant with low end-end energy loss, we present Bi-Phasic Quasi-static Brain Communication (BP-QBC), achieving < 60dB worst-case channel loss, and ~41X lower power w.r.t. traditional Galvanic body channel communication (G-BCC) at a carrier frequency of 1MHz (~6X lower power than G-BCC at 10MHz) by blocking DC current paths through the brain tissue. An additional 16X improvement in net energy-efficiency (pJ/b) is achieved through compressive sensing (CS), allowing a scalable (6kbps-10Mbps) duty-cycled uplink (UL) from the implant to an external wearable, while reducing the active power consumption to 0.52μW at 10Mbps, i.e. within the range of harvested body-coupled power in the downlink (DL), with externally applied electric currents < 1/5th of ICNIRP safety limits. BP-QBC eliminates the need for sub-cranial interrogators, utilizing quasi-static electrical signals for end-to-end BCC, avoiding transduction losses.","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129154158","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 5nm Fin-FET 2G-search/s 512-entry x 220-bit TCAM with Single Cycle Entry Update Capability for Data Center ASICs","authors":"Chetan Deshpande, Ritesh Garg, Gajanan Jedhe, Gaurang Narvekar, Sushil Kumar","doi":"10.23919/VLSICircuits52068.2021.9492464","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492464","url":null,"abstract":"This paper presents a 2G-search/s embedded Ternary Content Addressable Memory (TCAM) design in 5nm Fin-FET technology with the ability to update both SRAM words in a TCAM entry in a single clock cycle. This reduces TCAM update latency by 50% for data center Application Specific Integrated Circuits (ASICs) with only 1% area overhead and no search power penalty. We present a novel time multiplexed input bus interface on a single port TCAM cell array and new architecture to enable fast updates. Silicon measurement shows the highest reported search rate of 2G-search/s at a 3.48Mb/mm² memory density including all global peripheral circuitry for a 512 entry, 220-bit wide, 110Kb TCAM.","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126911525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Machine Learning Inspired Transceiver with ISI-Resilient Data Encoding: Hybrid-Ternary Coding + 2-Tap FFE + CTLE + Feature Extraction and Classification for 44.7dB Channel Loss in 7.3pJ/bit","authors":"Zhiping Wang, M. Megahed, Yusang Chun, Tejasvi Anand","doi":"10.23919/VLSICircuits52068.2021.9492510","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492510","url":null,"abstract":"This paper presents a machine learning inspired energy-efficient transceiver targeting long-reach channels using an ISI-resilient hybrid-ternary encoding on the transmitter and feature extraction and classification on the receiver. In addition to data encoding, the proposed transceiver also employs a 2-tap FFE and CTLE to achieve communication on a 44.7dB loss FR4 channel with BER less than 1×10⁻⁶, and an energy efficiency of 7.3pJ/bit at 13.8Gb/s in 65nm CMOS.","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133940121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design and Technology Solutions for 3D Integrated High Performance Systems","authors":"G. V. D. Plas, E. Beyne","doi":"10.23919/VLSICircuits52068.2021.9492421","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492421","url":null,"abstract":"3D system integration builds on interconnect scaling roadmaps of TSVs (5µm to 100nm CD) and fine pitch bumps/pads (to <1µm pitch) for D2W and W2W schemes. Si bridges connect chiplets at 9.5Gbp, 338fJ/b, while W2W fine pitch memory logic functional partitioning improves power/performance by 30% vs 2D. Impingement cooler, BSPDN, high density MIMCAP and integrated magnetics push the power wall to 300W/cm². On the other hand, 3D design flows require further development. Process optimization, DfT, KGD/S and heterogeneous technology optimization of functionally partitioned 3D-SOC make high performance systems cost-effective.","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131582738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fugaku and A64FX: the First Exascale Supercomputer and its Innovative Arm CPU","authors":"S. Matsuoka","doi":"10.23919/VLSICircuits52068.2021.9492415","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492415","url":null,"abstract":"Fugaku is the first exascale supercomputer in the world, designed and built primarily by Riken Center for Computational Science (R-CCS) and Fujitsu Ltd., but involving essentially all the major stakeholders in the Japanese HPC community. The name ‘Fugaku’ is an alternative name for Mt. Fuji, and was chosen to signify that the machine not only seeks very high performance, but also a broad base of users and applicability at the same time. The heart of Fugaku is the new Fujitsu A64FX Arm processor, which is 100% compliant to Aarch64 specifications, yet embodies technologies realized for the first time in a major server general-purpose CPU, such as 7nm process technology, on-package integrated HBM2 and terabyte-class SVE streaming capabilities, on-die embedded TOFU-D high-performance network including the network switch, and adoption of so-called ‘disaggregated architecture’ that allows separation and arbitrary combination of CPU core, memory, and network functions. Fugaku uses 158,974 A64FX CPUs in a single socket node configuration, making it the largest and fastest supercomputer ever created, signified by its groundbreaking achievements in major HPC benchmarks, as well as producing societal results in COVID-19 applications.","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132813703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}