2021 IEEE Asian Solid-State Circuits Conference (A-SSCC): Latest Papers, Page 9

A Hybrid Continuous Time Incremental and SAR Two-Step ADC with 90.5dB DR over 1MHz BW

2021 IEEE Asian Solid-State Circuits Conference (A-SSCC) Pub Date : 2021-11-07 DOI: 10.1109/A-SSCC53895.2021.9634732

Yanchao Wang, Siladitya Dey, Tao He, Lukang Shi, Jiawei Zheng, Manjunath Kareppagoudr, Yi Zhang, Kazuki Sobue, K. Hamashita, K. Tomioka, G. Temes

Citations: 5

A Sort-Less FPGA-Based Non-Maximum Suppression Accelerator using Multi-Thread Computing and Binary Max Engine for Object Detection

2021 IEEE Asian Solid-State Circuits Conference (A-SSCC) Pub Date : 2021-11-07 DOI: 10.1109/A-SSCC53895.2021.9634708

Chaoming Fang, Habib Derbyshire, Wenyu Sun, Jinshan Yue, Haobing Shi, Yongpan Liu

Abstract: The Non-Maximum Suppression (NMS) algorithm is an important post-processing step in object detection networks for various applications [1]. The standard NMS procedure suffers from poor time complexity and high power consumption due to its iterative, greedy search, making it a bottleneck for object detection networks implemented on various processors [2], [3]. Previous NMS accelerators achieved optimization by stacking arithmetic logic units or computing consecutive iterations simultaneously [4]–[6]. However, several challenges remain, as shown in Fig. 1. First, the highly iterative NMS process causes either high time complexity or high space complexity if the hardware resources are not designed properly. Second, the standard NMS process requires sorting the bounding boxes by score, and such sorting circuits occupy substantial resources and produce massive data movement. Finally, the Intersection over Union (IoU) calculation requires hardware-unfriendly operations such as multiplication and division, which take up valuable hardware resources such as DSPs.

Citations: 3
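
For reference, the standard greedy NMS procedure that the abstract above identifies as the bottleneck (sort by score, then repeatedly suppress overlapping boxes by IoU) can be sketched in Python as follows. This is the textbook baseline, not the paper's sort-less accelerator. The function names, the `(x1, y1, x2, y2)` box layout and the 0.5 threshold are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # The division here is one of the operations the abstract calls hardware-unfriendly.
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: Sequence[Box], scores: Sequence[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Return the indices of the kept boxes, highest score first."""
    # The global sort is the step a sort-less design tries to avoid.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Suppress every remaining box that overlaps the winner too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep
```

Each pass of the `while` loop compares the current winner against all surviving boxes, which is where the iterative cost described in the abstract comes from.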

A Feedback Architecture of High Speed True Random Number Generator based on Ring Oscillator

2021 IEEE Asian Solid-State Circuits Conference (A-SSCC) Pub Date : 2021-11-07 DOI: 10.1109/A-SSCC53895.2021.9634760

Xin Cheng, Haowen Zhu, Xinyi Xing, Yunfeng Zhang, Yongqiang Zhang, Guangjun Xie, Zhang Zhang

Abstract: True random number generators (TRNGs) are widely used to generate encryption keys in information security systems [1]–[2]. In a TRNG, the entropy source is a critical module that provides the randomness of the output bit stream. The unavoidable electrical noise in a circuit is an ideal entropy source because of its unpredictability. Among the methods of capturing electrical noise, a ring oscillator-based entropy source makes the TRNG most robust to deterministic noise and 1/f noise, giving it the strongest anti-interference capability; it is also simple in structure and easy to integrate [3]. Thus, considerable research attention has focused on ring oscillator-based TRNGs [3]–[7]. In [4], a high-speed TRNG with a 100 Mbps output bit rate was proposed, but it consumed too much power and area. A TRNG based on a tetrahedral ring oscillator was proposed in [5]; its power consumption was very low, but so was its output bit rate. A ring oscillator-based TRNG with a low output bit rate but high power was proposed in [7]. In short, none of the above architectures achieves an appropriate compromise between bit rate and power consumption. This work presents a new feedback architecture for a TRNG based on a tetrahedral ring oscillator. The output random bit stream generates a relatively random control voltage that acts on the transmission gates in the oscillator through a feedback loop, thus increasing the phase jitter of the oscillator and improving the output bit rate. Furthermore, an XOR chain-based post-processing unit is added to eliminate the statistical deviations and correlations between raw bits.

Citations: 4
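
The abstract above does not describe the structure of its XOR chain, so the following is only a generic sketch of XOR-based post-processing: each group of `k` raw bits is XORed into one output bit, which shrinks the bias at the cost of a `k`-fold lower bit rate. The names `xor_fold` and `folded_bias` and the group size are hypothetical, and the bias formula assumes independent raw bits (the piling-up lemma), which a real entropy source may not satisfy.

```python
from functools import reduce
from operator import xor
from typing import List, Sequence


def xor_fold(raw_bits: Sequence[int], k: int = 4) -> List[int]:
    """XOR each group of k consecutive raw bits into one output bit."""
    if k < 1:
        raise ValueError("k must be at least 1")
    usable = len(raw_bits) - len(raw_bits) % k  # drop an incomplete tail group
    return [reduce(xor, raw_bits[i:i + k]) for i in range(0, usable, k)]


def folded_bias(eps: float, k: int) -> float:
    """Bias magnitude after XORing k independent bits.

    Each input bit is assumed to deviate from probability 0.5 by |eps|;
    the piling-up lemma gives 2**(k - 1) * |eps|**k for the output.
    """
    return 2 ** (k - 1) * abs(eps) ** k
```

For example, under the independence assumption a raw bias of 0.1 drops to 0.02 after folding pairs of bits (`k = 2`).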

An 8.7 μJ/class. FFT accelerator and DNN-based configurable SoC for Multi-Class Chronic Neurological Disorder Detection

2021 IEEE Asian Solid-State Circuits Conference (A-SSCC) Pub Date : 2021-11-07 DOI: 10.1109/A-SSCC53895.2021.9634763

Zain Taufique, Bingzhao Zhu, G. Coppola, Mahsa Shoaran, Wala Saadeh, Muhammad Awais Bin Altaf

Citations: 7

A 4.2mW 4K 6-8GHz CMOS LNA for Superconducting Qubit Readout

2021 IEEE Asian Solid-State Circuits Conference (A-SSCC) Pub Date : 2021-11-07 DOI: 10.1109/A-SSCC53895.2021.9634832

Alican Çağlar, S. V. Winckel, S. Brebels, P. Wambacq, J. Craninckx

Abstract: Millions of qubits need to be employed in a quantum computer to achieve fault-tolerant quantum operation. To reduce the complexity of such a large-scale system, it has been proposed to place the control and readout circuitry at the 4 K stage of dilution refrigerators [1]. CMOS technology is commonly used to leverage its scaling to enable large-scale integration of control and readout circuitry with qubits. However, high-fidelity readout operations require low-noise amplifiers (LNAs) with a noise temperature of a few kelvin. This necessitates the use of HEMT and parametric amplifiers [2]. Recently reported CMOS LNAs are still far from attaining such performance [3–5]. Thus, this is one of the greatest challenges on the way to fully integrated CMOS readout. Additionally, due to the limited cooling power of dilution refrigerators, low-power solutions are needed for achieving very good noise performance at 4 K. This paper presents a 28 nm CMOS LNA for qubit readout, which achieves an order-of-magnitude power reduction compared to its CMOS counterparts while still providing similarly good noise figure (NF) performance at 4 K.

Citations: 5

An Arithmetic Progression Switched-Capacitor DC-DC Converter with Soft VCR Transitions Achieving 93.7% Peak Efficiency and 400 mA Output Current

2021 IEEE Asian Solid-State Circuits Conference (A-SSCC) Pub Date : 2021-11-07 DOI: 10.1109/A-SSCC53895.2021.9634798

Yang Jiang, M. Law, Pui-in Mak, R. Martins

Abstract: Dynamic source adaptation and supply modulation can benefit the power efficiency and system functionality of energy-harvesting interfaces, voltage-scalable SoCs, device drivers, power amplifiers, and others. A switched-capacitor (SC) DC-DC converter can achieve high power conversion efficiency (PCE) and power density at the hundreds-of-mW level. Several reconfigurable SC topologies have emerged to generate multiple voltage conversion ratios (VCRs) systematically with lower conduction and parasitic losses in steady state [1]–[4]. However, during VCR transitions, the voltage imbalance among the flying capacitors (CFLY) can induce charge redistribution loss. This hard-VCR-transition loss inevitably hurts the overall efficiency and remains unresolved. This work proposes an arithmetic progression (AP) SC DC-DC converter topology for systematic rational VCR generation while featuring soft VCR transitions. It keeps the voltage across each CFLY fixed irrespective of VCR changes, eliminating the CFLY voltage rebalance effect. The proposed AP topology also achieves the theoretical optimum in terms of the steady-state slow-/fast-switching-limited losses. Owing to its inherent two-phase quasi-symmetric output charge (QOUT) delivery, it ensures a low output ripple without using a conventional dual-branch converter architecture. We further propose a cross-coupled bootstrapping (XCBS) gate driver, operating at half the switching frequency (fSW/2), to control the flying power switches adaptively. Realizing step-down VCRs of 5:4/3/2/1, the proposed AP converter reaches a measured peak PCE of 93.7% and a maximum output current of 400 mA. Featuring soft VCR transitions, it demonstrates an average PCE of up to 89% under periodic VCR transitions at a rate (fVCR_tran) of 100 kHz.

Citations: 2
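
As a rough illustration of why a reconfigurable converter switches between VCRs, the sketch below picks a ratio for a target output voltage. It assumes that "5:4/3/2/1" denotes ideal Vout/Vin ratios of 4/5, 3/5, 2/5 and 1/5, and it uses the usual ideal efficiency bound for SC converters, Vout / (VCR × Vin). The function name `pick_vcr` and the example voltages are hypothetical, and none of the numbers it produces come from the paper.

```python
from fractions import Fraction
from typing import Tuple

# Assumed reading of "5:4/3/2/1": ideal Vout/Vin of 4/5, 3/5, 2/5 and 1/5.
VCRS = [Fraction(n, 5) for n in (4, 3, 2, 1)]


def pick_vcr(v_in: float, v_out: float) -> Tuple[Fraction, float]:
    """Pick the smallest ratio whose ideal output still reaches v_out.

    Returns the ratio and the ideal efficiency bound v_out / (ratio * v_in).
    """
    candidates = [r for r in VCRS if float(r) * v_in >= v_out]
    if not candidates:
        raise ValueError("target output is above the highest available ratio")
    ratio = min(candidates)
    return ratio, v_out / (float(ratio) * v_in)
```

With a 2.5 V input and a 0.9 V target, the sketch selects 2/5 (an ideal 1.0 V output) and reports a bound of 0.9. A finer set of ratios keeps this bound high over a wider output range, which is why the transition loss between ratios discussed in the abstract matters.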

A 24-30GHz 4-Element Phased Array Transceiver with Low Insertion Loss Compact T/R Switch and Bidirectional Phase Shifter in 65 nm CMOS Technology

2021 IEEE Asian Solid-State Circuits Conference (A-SSCC) Pub Date : 2021-11-07 DOI: 10.1109/A-SSCC53895.2021.9634813

Xiangrong Huang, Haikun Jia, Shengnan Dong, W. Deng, Zhihua Wang, B. Chi

Abstract: 5G technology greatly expands the field of mobile communication through its high data rate and low latency. The performance improvement of 5G necessitates a variety of technologies, including the phased-array technique. A T/R switch is widely used in phased arrays to reduce the number of antennas. It is one of the most critical modules in a phased array since it directly influences the output power of the TX and the noise figure (NF) of the RX. Conventional T/R switches are based on $\lambda/4$ transmission lines, which are area-consuming in the lower millimeter-wave frequency range [1]. Recently, lumped equivalent transmission lines based on inductors [2] and transformers [3] have been employed to reduce the chip area. However, lumped transmission lines are usually narrow-band. Furthermore, the inductors or transformers still occupy extra chip area. To address this issue, a compact T/R switch co-designed with the PA's output matching network and the LNA's input matching network is proposed. Leveraging the existing transformers in the two matching networks, only one extra transistor switch is needed in this compact T/R switch, which greatly reduces the chip area consumption and therefore the insertion loss.

Citations: 1

A 389TOPS/W, 1262fps at 1Meps Region Proposal Integrated Circuit for Neuromorphic Vision Sensors in 65nm CMOS

2021 IEEE Asian Solid-State Circuits Conference (A-SSCC) Pub Date : 2021-11-07 DOI: 10.1109/A-SSCC53895.2021.9634734

S. Bose, A. Basu

Abstract: Neuromorphic vision sensors (NVS) [1] are key enablers in traffic monitoring and surveillance systems that exploit the temporal redundancy in video streams to obtain $>2\times$ energy savings through blank-frame detection (Fig. 1). This concept of event-driven processing has also been used to reduce system energy for regular cameras [2]. However, an object typically occupies only a fraction of the full image frame (Fig. 1), leading to significant spatial redundancy in the image. Hence, energy-efficient hardware is required to detect the regions of interest (RoIs) in the valid frames and trigger an object recognition engine only for the RoIs. For a binary image, region proposal can be performed by the connected component labeling (CCL) algorithm [2]. However, CCL scans the image in a raster fashion to calculate the RoIs, leading to longer execution time and higher energy dissipation due to enormous data transfer. In contrast, emerging in-memory [3], [4] and near-memory [5] computing approaches offer a way to eliminate the data transfer cost and latency, promising further energy savings. In this paper, we propose a 9T-SRAM-based near- and in-memory computing region proposal integrated circuit (RPIC) that leverages the 1-D projections of the objects onto the vertical and horizontal axes. Further, we propose an iterative and selective search (ISS) algorithm to resolve overlapping projections among objects and provide an accurate count of the objects and their exact coordinates.

Citations: 3
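
The idea of proposing regions from 1-D projections of a binary image, as described in the abstract above, can be sketched in software as follows. This is a plain illustration under stated assumptions, not the paper's circuit or its ISS algorithm: the helper names `runs` and `propose_regions` are hypothetical, and candidate boxes that turn out to be empty are discarded by a direct pixel check.

```python
from typing import List, Sequence, Tuple

Run = Tuple[int, int]  # inclusive (start, end) indices of a non-zero run


def runs(projection: Sequence[int]) -> List[Run]:
    """Find maximal runs of non-zero entries in a 1-D projection."""
    found: List[Run] = []
    start = None
    for i, value in enumerate(projection):
        if value and start is None:
            start = i
        elif not value and start is not None:
            found.append((start, i - 1))
            start = None
    if start is not None:
        found.append((start, len(projection) - 1))
    return found


def propose_regions(image: Sequence[Sequence[int]]) -> List[Tuple[int, int, int, int]]:
    """Candidate boxes (row0, col0, row1, col1) from the two 1-D projections."""
    row_proj = [int(any(row)) for row in image]        # projection onto the vertical axis
    col_proj = [int(any(col)) for col in zip(*image)]  # projection onto the horizontal axis
    boxes = []
    for r0, r1 in runs(row_proj):
        for c0, c1 in runs(col_proj):
            # Crossing row runs with column runs can create empty candidates.
            if any(image[r][c] for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)):
                boxes.append((r0, c0, r1, c1))
    return boxes
```

When two objects share rows or columns their projections merge into one run, so this simple version returns a single box around both; that overlap case is the one the abstract says the ISS algorithm is meant to handle.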

An Efficient and Reliable Negative Margin Timing Error Detection for Neural Network Accelerator without Accuracy Loss in 28nm CMOS

2021 IEEE Asian Solid-State Circuits Conference (A-SSCC) Pub Date : 2021-11-07 DOI: 10.1109/A-SSCC53895.2021.9634809

Ziyu Li, Weiwei Shan, Chengjun Wu, Haitao Ge, Jun Yang

Abstract: Energy-efficient neural network (NN) accelerators are essential for IoT and mobile applications, where PVT variations become severe, especially in the near-threshold voltage (NTV) range. Recent work [1]–[4] applied error detection and correction (EDAC) based adaptive voltage frequency scaling (AVFS) to NN accelerators to eliminate excess timing margins while decreasing the supply voltage until timing violations are detected (Fig. 1). By exploiting the fault tolerance of NN accelerators to avoid error correction, these works increased energy efficiency considerably. However, NNs have limited tolerance to timing errors, since even a few timing errors can cause a serious loss of accuracy, for example up to 3% accuracy loss on MNIST [2]. Body swapping and adaptive clock techniques have also been adopted to reduce the accuracy loss [3–4]. A traditional AVFS system monitors the most critical paths and then decreases the voltage until reaching the point of first failure (PoFF). However, the critical paths of an NN accelerator have characteristics distinct from those of conventional circuits, which makes common EDAC inefficient and risky when applied to NNs.

Citations: 0

A 4.39ps, 1.5GS/s Time–to-Digital Converter with 4× Phase Interpolation Technique and a 2-D Quantization Array

2021 IEEE Asian Solid-State Circuits Conference (A-SSCC) Pub Date : 2021-11-07 DOI: 10.1109/A-SSCC53895.2021.9634753

Yongkuo Ma, Peiyuan Wan, Hongda Zhang, Zhi Wan, Xiaoyu Zhang, Xu Liu, Zhijie Chen

Abstract: With shrinking supply voltages and process scaling, time-based circuits are becoming more attractive than traditional voltage-domain circuits in ultra-deep-submicron mixed-signal design. A time-to-digital converter (TDC) is the key component in time-based circuits; it quantizes the time interval between the rising edges of two signals, Start and Stop. TDCs are widely used in frequency generation (digital phase-locked loops) [1], data conversion (time-based ADCs) [2], and energy-efficient neural network acceleration. The most elementary TDC is the delay-line TDC, which is also an essential component of other TDCs [3] [4] and has the merits of a simple structure and low power. However, the minimum intrinsic delay of a single delay element makes a high-resolution delay-line TDC difficult to realize. Moreover, its area and power consumption increase exponentially with the number of quantization bits, while its conversion speed decreases. To resolve this problem, this paper proposes a novel phase-interpolation time-to-digital converter (PI-TDC) with a 2-dimensional quantization array and a multiplex delay line technique.

Citations: 0
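
The behavior of the elementary delay-line TDC that the abstract above uses as its starting point can be modeled in a few lines. The sketch is an idealized behavioral model, not the proposed PI-TDC: the 4.39 ps step is taken from the paper title, while the 6-bit width, the function names and the saturation behavior are assumptions for illustration.

```python
def delay_line_tdc(t_start_ps: float, t_stop_ps: float,
                   tau_ps: float = 4.39, n_bits: int = 6) -> int:
    """Ideal delay-line TDC: count the taps Start has passed when Stop arrives."""
    interval = t_stop_ps - t_start_ps
    if interval < 0:
        raise ValueError("Stop must not precede Start")
    max_code = 2 ** n_bits - 1  # the output saturates at full scale
    return min(int(interval // tau_ps), max_code)


def delay_elements_needed(n_bits: int) -> int:
    """A flat delay line needs about 2**n_bits elements for n_bits of output."""
    return 2 ** n_bits
```

The second helper makes the scaling problem from the abstract concrete: each extra output bit doubles the number of delay elements, so a 10-bit flat delay line would need about 1024 of them.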