{"title":"Design of Wideband Phase Modulator for 2.4~5.25 GHz Digital Polar Transmitter","authors":"Haoliang Zhu, Zhiqun Li, Zhennan Li, Yan Yao","doi":"10.1109/ASICON52560.2021.9620405","DOIUrl":"https://doi.org/10.1109/ASICON52560.2021.9620405","url":null,"abstract":"This paper presents a wideband phase modulator for 2.4~5.25GHz digital polar transmitter implemented in 22 nm CMOS process. The phase modulator is an open-loop phase modulation architecture for wide bandwidth, and modulates the output phase according to the principle of vector-sum. The core circuit of the phase modulator is a high-resolution I/Q phase interpolator (PI), which achieves phase modulation by the weighted summation of in-phase (I) and quadrature (Q) signals. Differential-current DACs are used to implement the I/Q weighting. The simulation results show that in the range of 2.4~5.25GHz, the power consumption is 11mW, the phase resolution is 0.7°, and the operating temperature covers -40~85℃.","PeriodicalId":233584,"journal":{"name":"2021 IEEE 14th International Conference on ASIC (ASICON)","volume":"64 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121302734","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Impact of Evaporated AuNP Thickness on Pseudo-MOS and Its Application in Direct MicroRNA-375 Detection","authors":"Haihua Wang, Song He, K. Xiao, Yu-Long Jiang, Jing Wan","doi":"10.1109/ASICON52560.2021.9620235","DOIUrl":"https://doi.org/10.1109/ASICON52560.2021.9620235","url":null,"abstract":"Different from traditional surface functionalization using organosilane coupling agents for probe DNA immobilization, this work develops the evaporated gold nanoparticle (AuNP) as the linker on silicon-on-insulator pseudo-MOS transistor. The electrical impact of AuNP thickness on pseudo-MOS is systematically studied by measuring the transfer characteristic curves. The AuNP/pseudo-MOS device is applied to immobilize probe DNA, and a label-free and direct detection of target microRNA-375 is achieved.","PeriodicalId":233584,"journal":{"name":"2021 IEEE 14th International Conference on ASIC (ASICON)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128425018","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploiting Dynamic Bit Sparsity in Activation for Deep Neural Network Acceleration","authors":"Yongshuai Sun, Mengyuan Guo, Dacheng Liang, Shan Tang, Naifeng Jing","doi":"10.1109/ASICON52560.2021.9620448","DOIUrl":"https://doi.org/10.1109/ASICON52560.2021.9620448","url":null,"abstract":"Data sparsity is important in accelerating deep neural networks (DNNs). However, beyond zeroed values, bit sparsity, especially in activations, is often left unexploited by conventional DNN accelerators. In this paper, we present a DNN accelerator that exploits bit sparsity by dynamically skipping zeroed bits in activations. To this end, we first substitute the multiply-and-accumulate (MAC) units with more serial shift-and-accumulate units to sustain the computing parallelism. To prevent the low efficiency caused by the random number and positions of the zeroed bits in different activations, we propose activation-grouping, so that the activations in the same group can be computed on non-zero bits in different channels freely, and synchronization is only needed between different groups. We implement the proposed accelerator with 16 process units (PU) and 16 processing elements (PE) in each PU on an FPGA, built upon VTA (Versatile Tensor Accelerator), which integrates seamlessly with TVM compilation. We evaluate the efficiency of our design on the convolutional layers of ResNet-18, where it achieves over 3.2x speedup on average compared with the VTA design. In terms of the whole network, it achieves over 2.26x speedup and over 2.0x improvement in area efficiency.","PeriodicalId":233584,"journal":{"name":"2021 IEEE 14th International Conference on ASIC (ASICON)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115375058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Implementation of a CRNN-based low-power keyword recognition system on FPGA","authors":"Limo Guo, PengXu Lin, Lei Guo, Bo Liu","doi":"10.1109/ASICON52560.2021.9620311","DOIUrl":"https://doi.org/10.1109/ASICON52560.2021.9620311","url":null,"abstract":"A low-power and high-precision reconfigurable processor based on an optimized convolutional recurrent neural network (CRNN) is proposed for noise-robust keyword recognition. To create a low-power and high-precision system, we implemented a reconfigurable CRNN and quantization network on FPGA, which greatly reduced the use of DSP, BRAM, LUT and other resources. Our system can identify keywords such as \"yes\", \"no\", \"down\" and \"up\" within 50 ms, and at a signal-to-noise ratio of -5 dB, the actual accuracy reaches 86.4%.","PeriodicalId":233584,"journal":{"name":"2021 IEEE 14th International Conference on ASIC (ASICON)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114470040","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A CMOS Time-of-Flight Image Sensor with High Dynamic Range Digital Pixel","authors":"Shanzhe Yu, Yacong Zhang, Fei Zhou, Wengao Lu, Shuyu Lei, Zhongjian Chen","doi":"10.1109/ASICON52560.2021.9620349","DOIUrl":"https://doi.org/10.1109/ASICON52560.2021.9620349","url":null,"abstract":"To widen measuring range and suppress background light, a CMOS time-of-flight image sensor with high dynamic range digital pixel is proposed. The sensing charge is quantized by extended-counting analogue-to-digital converter (ADC) which consists of pixel coarse quantization and column fine quantization. To maximize dynamic range in limited pixel area, the coarse quantization circuit is shared by 2×2 pixels, in which a novel up/down counter with a small number of transistors is proposed. Based on the pixel circuit, a 32×32 prototype TOF imager with 12μm-pitch digital pixel is designed in 0.11μm 1P4M CMOS image sensor technology. Simulation results show that a 104.6dB dynamic range of this ADC for wide measuring range is achieved at a frame rate of 50Hz.","PeriodicalId":233584,"journal":{"name":"2021 IEEE 14th International Conference on ASIC (ASICON)","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114215950","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An 4th-order N-path Bandpass Filter with a Tuning Range of 1-30 GHz and OOB Rejection > 30 dB in 28 nm CMOS","authors":"Xi Wang, Junyan Ren, Shunli Ma","doi":"10.1109/ASICON52560.2021.9620337","DOIUrl":"https://doi.org/10.1109/ASICON52560.2021.9620337","url":null,"abstract":"A 4th-order N-path bandpass filter technique with a tuning range of 1-30 GHz and a constant 3-dB bandwidth of 500 MHz is analyzed. To meet the demand for multi-mode and multi-band RF receiver chips, traditional off-chip and non-tunable filters such as SAW and BAW filters should be replaced by N-path filters due to their wide tuning range of center frequency, high Q and small noise figure. The proposed filter consists of four switches (mixers) to convert the frequency band of input signals and a sampling clock circuit which operates at 2-60 GHz. The tuning range of conventional N-path filters is limited by the bandwidth of the clock divider and the overlap of the four clocks with fixed phase difference. To overcome these limitations, dividers and NAND gates based on a current mode logic (CML) structure are proposed. In this work, the out-of-band (OOB) rejection is greater than 30 dB and the simulated out-of-band IIP3 is 5.1 dBm at 25 GHz. To the best of the authors’ knowledge, this is the first N-path filter to operate in the millimeter-wave frequency band.","PeriodicalId":233584,"journal":{"name":"2021 IEEE 14th International Conference on ASIC (ASICON)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127207564","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Streaming Feature Extraction Accelerator using DPCM Image Compression Technique for SLAM Applications","authors":"Zhiyuan Wang, Zhuo Zhang, Haowen Chen","doi":"10.1109/ASICON52560.2021.9620342","DOIUrl":"https://doi.org/10.1109/ASICON52560.2021.9620342","url":null,"abstract":"The extraction of feature points plays a significant role in simultaneous localization and mapping (SLAM) applications. However, in the streaming architecture of the feature extraction, sizable row buffers are required to store data, usually occupying a large proportion of the hardware area. To ameliorate this problem, in this paper, we propose a streaming feature extraction architecture with narrower row buffers, combined with the differential pulse-code modulation (DPCM) image compression technique. Meanwhile, we improve the data flow to omit the compressions and decompressions in the critical data path by employing a novel strategy of transposing DPCM decompression and linear operation (TDDLO). Moreover, the calculations are further simplified by introducing an approximate algorithm of the rotation calculation. Consequently, the hardware costs are notably saved, while the impact of DPCM compression on power, latency, and accuracy is mitigated. The experimental results reveal at least a 32% reduction in memory compared with state-of-the-art architectures. Simulated by TSMC 28nm CMOS technology, the proposed architecture can process full-HD (1920×1080) images at 241 fps and consume only 52.7 mW power, while the normalized absolute trajectory error increases slightly by 0.2% on the TUM dataset.","PeriodicalId":233584,"journal":{"name":"2021 IEEE 14th International Conference on ASIC (ASICON)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125570478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analytical Global Placement for Heterogenous FPGAs Based on the eDensity Model","authors":"Huimin Wang, Xingyu Tong, Runming Shi, Sifei Wang, Jun Yu, Jianli Chen","doi":"10.1109/ASICON52560.2021.9620442","DOIUrl":"https://doi.org/10.1109/ASICON52560.2021.9620442","url":null,"abstract":"Recent years have seen increased research attention given towards the global placement problem due to the growing capability and heterogeneity of FPGAs. Designed specially for heterogeneous FPGAs, a novel analytical algorithm for global placement problem is proposed and introduced in this paper. On the basis of the eDensity model, our well-proven algorithm aims to get a high-quality solution without efficiency loss. Besides, a fence region processing strategy is implemented to satisfy the heterogeneity constraints. To make the placement solution more compact and thus optimize the total wirelength, we inject appropriate doses of redundant eDensity charges onto instances to be placed. Furthermore, a repulsive force generation technology is adopted to prevent cells from entering the unplaceable regions. We use the nonlinear optimizer to solve our heterogenous objective function. Experimental results on modern industry benchmarks show that our proposed algorithm achieves 8.16% wirelength reduction and 38.89% runtime acceleration on average compared with the commercial tool Procise™.","PeriodicalId":233584,"journal":{"name":"2021 IEEE 14th International Conference on ASIC (ASICON)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126159516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Highly Efficient Modulo Loop Pipeline For High Level Synthesis","authors":"Chang Wu, Jundong Xie, Kexin Wang","doi":"10.1109/ASICON52560.2021.9620276","DOIUrl":"https://doi.org/10.1109/ASICON52560.2021.9620276","url":null,"abstract":"State-of-the-art loop pipeline algorithms use iterative SDC scheduling to compute a best Initiation Interval (II). However, the time complexity of SDC scheduling itself is O(n^2(m + n log n) log n) for a Control and Data Flow Graph (CDFG) with n nodes and m constraints. This can be very high for large loops. In this paper, we propose a linear time scheduling algorithm for loop pipeline without back-tracking. Our test results show that our algorithm can be over 1000x faster than the iterative SDC-based algorithm in LegUp, while achieving the same II. When compared with the industrial tool VivadoHLS, our algorithm can still be over 500x faster, on average, with comparable quality of results.","PeriodicalId":233584,"journal":{"name":"2021 IEEE 14th International Conference on ASIC (ASICON)","volume":"21 7","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121008414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A New Sparsity Preserving Model Order Reduction Algorithm for Multi-terminal RC Networks","authors":"Xin Chen, Lin Pan, Yangxin Xiang","doi":"10.1109/ASICON52560.2021.9620477","DOIUrl":"https://doi.org/10.1109/ASICON52560.2021.9620477","url":null,"abstract":"VLSI post-layout parasitic analysis increasingly demands fast simulation methods for huge multi-terminal networks. Model order reduction (MOR) can address this with a smaller model that approximates the response at the terminals, but the loss of system sparsity can slow simulation considerably. To preserve sparsity, we introduce incomplete LU decomposition and a pre-processing procedure into projection-based reduction methods. Experimental results show that the circuit simulation speed improves by about 3X-11X with RMS error lower than 2e-3.","PeriodicalId":233584,"journal":{"name":"2021 IEEE 14th International Conference on ASIC (ASICON)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128150110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}