{"title":"An IMPLY-based Memristive Multiplier for Computing-in-Memory Systems with Weight-Stationary CNN Acceleration","authors":"Wenhui Liang, Jiarui Xu, Yuansheng Zhao, Zixuan Shen, Guoyi Yu, Yuhui He, Chao Wang","doi":"10.1109/ICTA56932.2022.9962994","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9962994","url":null,"abstract":"Adders and multipliers based on memristive Material Implication (IMPLY) logic are widely used in primary building blocks of Arithmetic Logic Unit (ALU). To solve the issue that the existing IMPLY-based multipliers cannot protect the input operands, this paper presents a novel data non-destructive memristive IMPLY-based semi-parallel multiplier for Computing-in-Memory (CIM) systems, by assigning function-specific memristors for data-protection and introducing additional switches for higher parallelism. Simulation results show that the proposed multiplier can achieve 30% faster than conventional semi-parallel design and 9.1 % less memristors against the state-of-art semi-serial design for 4-bit multiplication, while preventing the input weight from destruction as required by CNN weight reuse.","PeriodicalId":325602,"journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125547469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Implementation of CNN Heterogeneous Scheme Based on Domestic FPGA with RISC-V Soft Core CPU","authors":"Hailong Wu, Jindong Li, Xiang Chen","doi":"10.1109/ICTA56932.2022.9963056","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9963056","url":null,"abstract":"Field Programmable Gate Array (FPGA) has the characteristics of low power consumption, high performance and flexibility. Research on FPGA neural network acceleration is emerging, but most of the researches are based on foreign FPGA devices. In order to improve the current situation of domestic FPGA, a novel Convolutional neural networks (CNNs) accelerator for domestic FPGA equipped with lightweight RISC-V soft core is proposed. The peak performance of the proposed accelerator reaches 153.6 GOP/s, occupying only 14K LUTs (Look-Up-Table), 32 DRMs (Dedicated RAM Modules) and 208 APMs (Arithmetic Process Modules). The proposed accelerator has enough computing power for most of the Edge-AI applications and embedded systems, providing a possible AI inference acceleration solution for domestic FPGA.","PeriodicalId":325602,"journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129450555","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CVD Monolayer tungsten-based PMOS Transistor with high performance at Vds = -1 V","authors":"Xin Wang, Yanqing Wu","doi":"10.1109/ICTA56932.2022.9963068","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9963068","url":null,"abstract":"Two-dimensional (2D) semiconducting materials channels enable ultimate scaling of transistors and will help Moore's Law Scaling for decades. In this paper, we reported p-type WSe2 transistors using monolayer (~0.85 nm) channels by molten-salt-assisted chemical vapor deposition. The transfer-free back-gate devices fabricated based on 100 nm SiO2/Si substrate exhibit highest on current at Vds = -1 V among transistors of monolayer p-WSe2, and a high on/off ratio up to 10^8.","PeriodicalId":325602,"journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114376919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prediction of Key Metrics of Stacked Nanosheet nFETs using Genetic Algorithm-based Neural Networks","authors":"Haoqing Xu, Weizhuo Gan, Lei Cao, H. Yin, Zhenhua Wu","doi":"10.1109/ICTA56932.2022.9963088","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9963088","url":null,"abstract":"In this paper, we demonstrate the prediction of important figures of merit (FoMs) including threshold voltage (Vth), subthreshold swing (SS), on-state (Ion) and off-state (Ioff) current, of vertically stacked lateral nanosheet field-effect-transistors (NSFET) using 1) an artificial neural network generated by genetic algorithm (GA) and 2) a conventional multi-layer neural network (NN). Our work shows that the trained GA-based NN has a great capability of predicting FoMs with an average of coefficients of determination at 0.992, which is better than that of the trained multi-layer neural network at 0.987. Additionally, GA-based NN has a significant reduction of calculation time by 80% compared with that of multi-layer NN under the same computing power, which indicates the possibility to reduce the computational cost by using the auto-machine learning approach for TCAD simulation.","PeriodicalId":325602,"journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131358855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 56Gb/s De-serializer with PAM-4 CDR for Chiplet Optical-I/O","authors":"Yunqiang Yang, Ming Zhong, Qianli Ma, Ziyi Lin, Leliang Li, Guike Li, Liyuan Liu, Jian Liu, N. Wu, Haikun Jia, Xinghui Liu, Nan Qi","doi":"10.1109/ICTA56932.2022.9963101","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9963101","url":null,"abstract":"This paper presents a 56Gb/s de-serializer with PAM-4 CDR for chiplet optical-I/O in 28nm CMOS. There are two channels in this chip. Each channel consists of a high-performance analog front end (AFE) and a half-rate clock and data recovery (CDR) circuit based on a digital phase interpolator and digital loop filter. To provide 28-GHz clock signals to both channels, a clock distribution circuit is integrated. Experimental results show that the proposed de-serializer recovers a 56Gb/s PAM-4 input signal with channel loss, achieving an output swing of 1.01-Vppd and 760ps RMS jitter.","PeriodicalId":325602,"journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131386439","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Tunable Monopole Antenna for 5G Communication Applications","authors":"Liangfan Chen, Lu Zhao, Zihao Chen","doi":"10.1109/ICTA56932.2022.9962969","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9962969","url":null,"abstract":"A dual-band tunable monopole antenna is designed for 5G communication applications. The devised tuner consists of RF switch and RF capacitors of 0.3 pF, 0.5 pF, 1 pF, 2 pF and 5 pF, which enables the monopole antenna to be operated in different frequency bands. The proposed antenna is fabricated and measured. The measured -10 dB input impedance bandwidths of the proposed antenna are 1.32 GHz - 1.95 GHz and 1.98 GHz - 5.02 GHz, which can fully cover the 5G frequency spectrum in China.","PeriodicalId":325602,"journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132760444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A High PSR and Fast Transient Response Output-Capacitorless LDO using Gm-Boosting and Capacitive Bulk-Driven Feed-Forward Technique in 22nm CMOS","authors":"Heng Liu, Dongxu Li, Xian Tang","doi":"10.1109/ICTA56932.2022.9963003","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9963003","url":null,"abstract":"This paper presents an output-capacitorless low-dropout regulator (OCL-LDO) using capacitive bulk-driven feed-forward (CBDFF) technique and an adaptive-biasing error amplifier with gm-boosting to enhance the power supply rejection (PSR) and the transient response. The proposed OCL-LDO has been implemented in a 22nm CMOS technology. It consumes a quiescent current of 49 µA from a power supply of 1.05-1.25 V and has a dropout voltage of 200 mV. The OCL-LDO achieves -84 dB PSR at low frequency and -69 dB PSR at 1 MHz for the load current of 20 mA. It achieves a line regulation of 0.18 mV/V, a load regulation of 0.77 µV/mA, and a settling time of 135 ns.","PeriodicalId":325602,"journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117002343","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Large Suppression to Lateral Charge Migration (LCM) Related Error Bits in Charge-Trap TLC 3D NAND Flash","authors":"Kenie Xie, Pena Guo, Fei Chen, Binglu Chen, Xiaotong Fang, Jixuan Wu, Xuepeng Zhan, Jiezhi Chen","doi":"10.1109/ICTA56932.2022.9962997","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9962997","url":null,"abstract":"We present a study to suppress error bits from lateral charge migration (LCM) in charge-trap (CT) 3D NAND flash memory. For the first time, a new Baking-and-Pre-read (BPR) method is proposed with combined long-time charge diffusion by baking and short-time stabilizing by Pre-read. By characterizing 96-layer Triple-level-cell (TLC) 3D NAND chips by the raw NAND chip tester, the storage stabilities, including data retention (DR) and read disturb (RD), are studied and it is found that DR/RD error bits can be reduced up to >70%, which could be explained by the large effects of suppression to LCM-related threshold voltage (Vth) down-shifts.","PeriodicalId":325602,"journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117174482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 64Gb/s PAM-4 Digital Equalizer With Tap-Configurable FFE and Partially Unrolled DFE in 28nm CMOS","authors":"Xinjie Feng, Yong-Nan Chen, Youzhi Gu, Jiangfeng Wu","doi":"10.1109/ICTA56932.2022.9963099","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9963099","url":null,"abstract":"This paper presents a high-performance digital equalizer with four-level pulse amplitude modulation (PAM-4) for 64Gb/s backplane I/Os. The digital equalizer consists of a tap-configurable feed-forward equalizer (FFE) and a partially unrolled decision-feedback equalizer (DFE). The first two post-cursor is covered by DFE and then FFE follows, which can largely reduce the influence of noise and crosstalk. The configurable FFE taps enable better adaption for different kind of channels. In order to optimize the internal algorithm, the look-up table (LUT) is used in both FFE and DFE. And the DFE is unrolled for timing closing using a new architecture introduced in this paper. Fabricated in 28nm CMOS, the digital equalizer operates at 64Gb/s with only 5pJ/bit power consumption at 1V.","PeriodicalId":325602,"journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117262143","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Fully-Connected and Area-Efficient Ising Model Annealing Accelerator for Combinatorial Optimization Problems","authors":"Yukang Huang, Dong Jiang, Yongkui Yang, Enyi Yao","doi":"10.1109/ICTA56932.2022.9963022","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9963022","url":null,"abstract":"The combinatorial optimization problem is ubiquitously in our daily life and typically inefficient for modern Von Neumann architecture-based computer. Targeting for various combinatorial optimization problems, this paper presents a 10K-bit area-efficient architecture of the domain specific accelerator based on fully-connected Ising model using an FPGA platform. The proposed system is based on simulated annealing algorithm with a spin preselection scheme to prevent the system to be trapped in the local minimum and increase the convergence efficiency, which is more easily and efficiently to be hardware implemented. Using max-cut problem as the experiment benchmark, the proposed hardware architecture achieves an acceleration of 50,000 × compared with the software simulation result.","PeriodicalId":325602,"journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)","volume":"86 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123502538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}