{"title":"FPGA Implementation of N-BEATS for Time Series Forecasting Using Block Minifloat Arithmetic","authors":"Wenjie Zhou, Haoyan Qi, D. Boland, Philip H. W. Leong","doi":"10.1109/APCCAS55924.2022.10090282","DOIUrl":"https://doi.org/10.1109/APCCAS55924.2022.10090282","url":null,"abstract":"The block minifloat (BM) number format uses an 8-bit floating point format with an additional shared exponent bias to enable low-precision representation with large dynamic range. While it has been shown that the BM format can support low-precision training of convolutional neural networks such as ResNet on ImageNet at precisions down to 6 bits, its applicability to inference-only applications has not been studied. We present a BM implementation of N-BEATS, a deep neural architecture for univariate time series forecasting. N-BEATS utilises residual and fully connected (FC) blocks to achieve high accuracy. It was found that 8-bit BM had similar area and speed to 8-bit integer arithmetic, with N-BEATS accuracy similar to 16-bit floating point.","PeriodicalId":243739,"journal":{"name":"2022 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS)","volume":"253 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115281181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Third-Order CIFF Noise-Shaping SAR ADC with Nonbinary Split-Capacitor DAC","authors":"Peng Zhang, Xiaoyong He, Shuhao Lai, Zehui Wu","doi":"10.1109/APCCAS55924.2022.10090292","DOIUrl":"https://doi.org/10.1109/APCCAS55924.2022.10090292","url":null,"abstract":"A third-order CIFF noise-shaping SAR ADC is proposed in this paper. To address the input-referred noise of the multi-input comparator, stacking capacitors are used to realize the addition of the input signal and the integrated residual voltages. To reduce area and power consumption, a nonbinary split-capacitor DAC is proposed. The DAC input capacitor is 0.6 pF. The ADC is designed in a 130 nm process with a sampling rate of 5 MS/s. The simulation results show that with an oversampling ratio of 8, the ADC achieves 80.2 dB SNDR and 86.8 dB SFDR, and the ENOB is 13 bits. The total power consumption of the ADC is about 607 μW.","PeriodicalId":243739,"journal":{"name":"2022 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131558608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accurate Geometric Programming-Compatible Slew Rate Modeling for Two-Stage Operational Amplifier Design Optimization","authors":"Eric J. Wyers","doi":"10.1109/APCCAS55924.2022.10090335","DOIUrl":"https://doi.org/10.1109/APCCAS55924.2022.10090335","url":null,"abstract":"Monomial models for the two-stage operational amplifier positive and negative slew rate performances are proposed in this work to aid the integrated circuit designer in producing optimal designs within the geometric programming design framework. Compared to the commonly-used and inaccurate slew rate design equations, the developed slew rate monomial models are capable of producing designs which have excellent slew rate performance agreement between the design optimization framework and circuit simulation, are based on highly-accurate slew rate design equations, require minimal overhead to produce, and have minimal modeling complexity with respect to the number of parameters to be estimated. We demonstrate the efficacy of the proposed slew rate models via a design test case in a standard 1.8-V, 0.18-μm CMOS technology.","PeriodicalId":243739,"journal":{"name":"2022 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127017891","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High Efficient Architecture of Polynomial Multiplier with Variable Parameter Based on 2KNTT","authors":"Hui-qin Li, Tao Chen, Aijun Wu, Chao-xing Xu, Wei Li, Longmei Nan","doi":"10.1109/APCCAS55924.2022.10090300","DOIUrl":"https://doi.org/10.1109/APCCAS55924.2022.10090300","url":null,"abstract":"In July 2022, among the finalists for the fourth round of NIST's post-quantum public key cryptography, the lattice-based algorithms that use NTT to implement polynomial multiplication are CRYSTALS-KYBER and CRYSTALS-Dilithium. Therefore, in this paper, we design an efficient structure for polynomial multiplication based on the 2KNTT method, with the aim of improving practical performance and supporting the variable parameters of these two algorithms. Based on the existing 2KNTT algorithm, an eight-way parallel practical model that adapts to the requirements of the algorithm is designed, with the storage granularity determined in advance. Specifically, the modular multiplication unit can support operations with different moduli, and the control unit can support different numbers of terms. Experimental results show that this design supports polynomial multiplication with moduli of 12 to 32 bits, term counts of 128, 256, 512, and 1024, and the modulus polynomial $x^{n}+1$; it takes 2052 cycles to perform a polynomial multiplication with the maximum parameters (n=1024, q=8380417).","PeriodicalId":243739,"journal":{"name":"2022 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131880179","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High Precision Hysteresis Controlled MPPT Circuit for Vibration Energy Harvesting","authors":"Chunbiao Pan, Yidie Ye, Huakang Xia","doi":"10.1109/APCCAS55924.2022.10090322","DOIUrl":"https://doi.org/10.1109/APCCAS55924.2022.10090322","url":null,"abstract":"A high precision hysteresis controlled maximum power point tracking (HPHC-MPPT) circuit for vibration energy harvesting is proposed in this paper. It is realized with the fractional open-circuit voltage (FOCV) MPPT method, and a hysteresis comparator based controller is used to create a precise hysteresis voltage window that includes half of the open-circuit voltage of the piezoelectric transducer (PZT) after the rectifier. An adaptive sampling rate controller is added to reduce the sampling power consumption. An energy management unit is used to transfer energy from the rectifier output capacitor to the energy storage capacitor with high efficiency. The proposed circuit is designed and simulated in the SMIC 0.18 μm process. The results show that the MPPT efficiency is as high as 99.45%, and the maximum conversion efficiency can reach 94.2% under wide variations of the input vibration energy and the system load.","PeriodicalId":243739,"journal":{"name":"2022 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133809042","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Vector Pair Based DWA Algorithm for Linearity Enhancement of CDACs in the NS-SAR ADC","authors":"Xiao Han, Xiyuan Tang, Yanxing Suo, Qiao Cai, Xinzi Xu, T. Wan, Yang Zhao","doi":"10.1109/APCCAS55924.2022.10090367","DOIUrl":"https://doi.org/10.1109/APCCAS55924.2022.10090367","url":null,"abstract":"The conventional DWA algorithm cannot be directly applied to the mismatch shaping of the SAR-type DACs that are widely used in NS-SAR ADCs. The emerging DWA algorithm requires an extra coarse ADC, which complicates the system and thus limits the figure of merit of the high-resolution NS-SAR ADC. This paper presents a vector pair based DWA algorithm that eliminates the need for the coarse ADC. By employing two vectors that start at adjacent positions and update to circularly select the elements in opposite directions, the mismatch error of the SAR-type DAC is first-order shaped. A theoretical proof is given, and the shaping performance is simulated in this paper.","PeriodicalId":243739,"journal":{"name":"2022 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS)","volume":"31 5-6","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114135994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Simulation of Gap Structure Optical Waveguide with Phase Change Materials","authors":"Jinxuan Liang, Zhuoxuan Zhu, Yida Dong, Lei Lei, Yida Li, Mei Shen, Xiaozhen Xiang","doi":"10.1109/APCCAS55924.2022.10090290","DOIUrl":"https://doi.org/10.1109/APCCAS55924.2022.10090290","url":null,"abstract":"Optical phase change materials (O-PCMs) have attracted extensive attention due to their large change in optical properties between phase transitions. Their application in rewritable non-volatile photonic devices is extremely interesting, except for the drawback of moderate loss induced by the O-PCM. In this work, a “gap” structure waveguide device with an O-PCM layer is proposed to simulate the optical propagation in both the amorphous and crystalline states. The effective refractive indices (n_eff), losses (α) and figures of merit (FOMs) are obtained by simulations for Ge₂Sb₂Te₅ (GST), GeS, Sb₂S₃, Sb₂Se₃ and Ge₂Sb₂Se₄Te (GSST). The results indicate that the “gap” structure effectively mitigates the propagation loss induced by the O-PCM. By comparing the results for different materials, we find that GeS and Sb₂S₃ are the most effective modulation candidates in the propagation of TM-polarized and TE-polarized light, respectively.","PeriodicalId":243739,"journal":{"name":"2022 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117149723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A High-Speed NTT-Based Polynomial Multiplication Accelerator with Vector Extension of RISC-V for Saber Algorithm","authors":"Honglin Kuang, Yifan Zhao, Jun Han","doi":"10.1109/APCCAS55924.2022.10090293","DOIUrl":"https://doi.org/10.1109/APCCAS55924.2022.10090293","url":null,"abstract":"Saber is a module-learning with rounding-based post-quantum cryptography (PQC) scheme for key encapsulation mechanism (KEM). It is characterized by the use of power-of-two moduli, which makes all modulus reductions free in hardware. However, such a decision prevents the direct implementation of the asymptotically fastest number theoretic transform (NTT) for the time-consuming polynomial multiplication in Saber. To multiply polynomials efficiently, prior work has used schoolbook, Toom-Cook, or Karatsuba algorithms. Though these approaches result in decent operating speed at moderate area cost, they are at a disadvantage when the system is to be expanded to support multiple PQC protocols. To enable NTT for Saber, we choose an appropriate prime and use the sign-magnitude format for computation. A concise and efficient vectorized NTT algorithm is proposed, based on which we design a configurable vector NTT unit to perform NTT and other arithmetic operations. The accelerator is dedicatedly pipelined to achieve high speed and is driven by a custom vector instruction extension of RISC-V. We implement the proposed architecture with 32 and 16 vector lanes on a Xilinx UltraScale+ ZCU111. Results show that our design can achieve up to 5× and 3× improvement in computation time and area-time-product (ATP), respectively, for degree-256 polynomial multiplication, compared to state-of-the-art Saber polynomial multipliers.","PeriodicalId":243739,"journal":{"name":"2022 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115531931","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Wireless Power Design with High PCE and Fast Transient Response over a Large Loading Range for Multi-channel Neural Stimulators","authors":"Weisong Liang, Xu Liu, Weijian Chen, Ze-Xi Lu, Peiyuan Wan, Zhijie Chen","doi":"10.1109/APCCAS55924.2022.10090269","DOIUrl":"https://doi.org/10.1109/APCCAS55924.2022.10090269","url":null,"abstract":"A brain-machine interface (BMI) with implantable bioelectronic systems can provide an alternative way to cure neural diseases, and a wireless power transfer (WPT) system plays an important role in providing a stable voltage supply for the implanted chip. A WPT system for multichannel neural stimulators with high power conversion efficiency (PCE) and low power dissipation over a large loading range is proposed in this work. Both the internal Vth cancelation (IVC) and the dynamic bulk modulation (DBM) schemes are used to maximize the PCE of the rectifiers. In addition, a reverse nested Miller compensation (RNMC) LDO with a transient enhancer is proposed for the WPT system. Simulation results show that the total PCE is 55% at its peak, and the power consumption is 0.55 mW and 22.5 mW at standby (SB) and full stimulation (ST) load, respectively. For a full load transition, the overshoot and downshoot of the LDO are 110 mV and 71 mV, respectively, which helps improve the load transient response during neural stimulation.","PeriodicalId":243739,"journal":{"name":"2022 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123634735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ternary In-Memory MAC Accelerator With Dual-6T SRAM Cell for Deep Neural Networks","authors":"Xudong Wang, Ge Li, Jiacong Sun, Huanjie Fan, Yong Chen, Hailong Jiao","doi":"10.1109/APCCAS55924.2022.10090389","DOIUrl":"https://doi.org/10.1109/APCCAS55924.2022.10090389","url":null,"abstract":"In-memory computing (IMC) based on static random access memory (SRAM) is a promising solution to enable highly energy-efficient multiply-accumulate (MAC) operations for machine learning accelerators. In this paper, an in-SRAM computing technique is proposed by using a dual-six-transistor (dual-6T) SRAM cell. The dual-6T SRAM cell is composed of two conventional-6T-SRAM-cell-like 6T cells with split wordlines, achieving a compact array layout. With specialized coding, the dual-6T SRAM circuit is one of the few in-memory accelerators which support parallel MAC operations with both ternary activation and ternary weight. A 128×64 memory array is implemented in a 55-nm low-power CMOS technology. Due to the compact bitcell topology and smart coding, the proposed dual-6T memory array achieves up to 635 TOPS/W energy efficiency @ 100 MHz and 38.84 TOPS/mm² peak area efficiency @ 350 MHz, which is competitive among the state-of-the-art in-memory computing MAC accelerators.","PeriodicalId":243739,"journal":{"name":"2022 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116354428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}