{"title":"Neuromorphic Recurrent Spiking Neural Networks for EMG Gesture Classification and Low Power Implementation on Loihi","authors":"Ahmed Shaban, S. S. Bezugam, M. Suri","doi":"10.1109/ISCAS46773.2023.10181510","DOIUrl":"https://doi.org/10.1109/ISCAS46773.2023.10181510","url":null,"abstract":"In this work, we show efficient Electromyography (EMG) gesture recognition using a Double Exponential Adaptive Threshold (DEXAT) neuron based Recurrent Spiking Neural Network (RSNN). Our network achieves a classification accuracy of 90% while using fewer neurons than the best reported prior art on the Roshambo EMG dataset. Further, to illustrate the benefits of dedicated neuromorphic hardware, we show a hardware implementation of the DEXAT neuron using a multicompartment methodology on Intel's neuromorphic Loihi chip. The RSNN implementation on Loihi (Nahuku 32) achieves significant energy/latency benefits of ~983X/19X compared to a GPU for batch size = 50.","PeriodicalId":177320,"journal":{"name":"2023 IEEE International Symposium on Circuits and Systems (ISCAS)","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116182400","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Edge FPGA-based Onsite Neural Network Training","authors":"Ruiqi Chen, Haoyang Zhang, Yu Li, Runzhou Zhang, Guoyu Li, Jun Yu, Kun Wang","doi":"10.1109/ISCAS46773.2023.10181582","DOIUrl":"https://doi.org/10.1109/ISCAS46773.2023.10181582","url":null,"abstract":"Conjugate gradient (CG) is widely used in training sparse neural networks. However, CG, which involves a large number of sparse matrix and vector operations, cannot be efficiently implemented on resource-limited edge devices. In this paper, a high-performance and energy-efficient CG accelerator implemented on an edge Field Programmable Gate Array (FPGA) is proposed for fast onsite neural network training. Based on profiling, we propose a unified matrix multiplier that is compatible with both sparse and dense matrices. We also design a novel T-engine to handle the transpose operation in the compressed sparse format. Experimental results show that our proposal outperforms the state-of-the-art FPGA work with a resource reduction of up to 41.3%. In addition, we achieve on average $10.2\\times$ and $2.0\\times$ speedup, and $10.1\\times$ and $3.5\\times$ better energy efficiency, than implementations on CPU and GPU, respectively.","PeriodicalId":177320,"journal":{"name":"2023 IEEE International Symposium on Circuits and Systems (ISCAS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123718579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Time-based Memristor Crossbar Array Programming for Stochastic Computing Parallel Sequence Generation","authors":"Nikos Temenos, V. Ntinas, P. Sotiriadis, G. Sirakoulis","doi":"10.1109/ISCAS46773.2023.10181967","DOIUrl":"https://doi.org/10.1109/ISCAS46773.2023.10181967","url":null,"abstract":"The so far dominant Von Neumann architecture is being challenged by the energy-demanding communication bottleneck between processing and memory units. To address this issue, in-memory computing is employed for their co-location, with memristive crossbar arrays playing an important role towards this goal. Motivated by the above, this work introduces a timing-based programming of a memristor crossbar array for sequence generation in Stochastic Computing (SC). Its operation principle is based on the stochastic nature of the memristor devices forming the crossbar array, where their programming is regulated by the switching probability that follows the Poisson distribution, controlled by pulse amplitude and duration. The timing-based programming of the proposed crossbar array increases the discretization levels of the output probability values, thereby offering more accurate control when compared to programming schemes that consider only the pulse amplitude. The memristor's stochasticity along with the crossbar's inherent parallelism opens the in-memory design space, allowing SC elements to be used as sequences are generated efficiently. Simulation results on different programming pulse-width precisions highlight the proposed crossbar's effectiveness in sequence generation, supported by mean absolute error (MAE) results in a standard SC arithmetic operation. Process variations stemming from the crossbar array affecting the sequence generation in SC are investigated.","PeriodicalId":177320,"journal":{"name":"2023 IEEE International Symposium on Circuits and Systems (ISCAS)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116824560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ultra-Tiny Neural Network for Compensation of Post-soldering Thermal Drift in MEMS Pressure Sensors","authors":"G. Licciardo, P. Vitolo, S. Bosco, Santo Pennino, D. Pau, M. Pesaturo, L. D. Benedetto, R. Liguori","doi":"10.1109/ISCAS46773.2023.10181480","DOIUrl":"https://doi.org/10.1109/ISCAS46773.2023.10181480","url":null,"abstract":"MEMS pressure sensors are widely used in several application fields, such as industrial, medical, and automotive, where they are required to be increasingly accurate and reliable. However, these sensors are very sensitive to mechanical and temperature variations. For example, the soldering process, which involves significant thermal stress, causes drift in the sensor accuracy. This article introduces a digital circuit implementing a very tiny neural network able to compensate for the measurement drift in real time. The circuit is capable of correcting accuracy drift of up to 1.6 hPa, restoring the accuracy to $\\pm 0.5 \\text{hPa}$. Synthesis results on TSMC 130 nm CMOS technology show an area occupation of 0.0373 $\\text{mm}^{2}$ and a dynamic power of 1.07 $\\mu \\mathrm{W}$, which enable its easy integration into the digital circuitry available in the MEMS sensor package for conditioning pressure measurements.","PeriodicalId":177320,"journal":{"name":"2023 IEEE International Symposium on Circuits and Systems (ISCAS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117212920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Unified SoC Lab Course: Combined Teaching of Mixed Signal Aspects, System Integration, Software Development and Documentation","authors":"J. Pfau, R. Leys, Marc Neu, Alexey Serdyuk, Ivan Peric, J. Becker","doi":"10.1109/ISCAS46773.2023.10181679","DOIUrl":"https://doi.org/10.1109/ISCAS46773.2023.10181679","url":null,"abstract":"University courses for System-on-Chip (SoC) design mostly focus on particular aspects. Whereas this can provide detailed understanding of these aspects, it neglects system integration specific topics such as crossing digital and analog domains. In addition, many courses skip practical issues and do not teach Electronic Design Automation (EDA) tools. This is reasonable in the context of a specialized course, but omitting these techniques often prevents students from making active use of the learned knowledge in their own projects. In the following, we present our technological platform to teach SoC design in a holistic lab course: The course takes students from writing the first line of Verilog code to advanced digital and analog design. It introduces simulation of digital and analog systems, debugging methods for software and hardware, CPU bus architecture, custom peripherals and driver development. This work is prototyped using Field Programmable Gate Arrays (FPGAs) and later transferred to an ASIC target, covering standard cell synthesis and analog layout. At the end of the semester, students finalize the project with technical documentation writing. The course is built on the design of an audio peripheral, combining all topics in a single real-world system. It enables students to apply theoretical aspects from various lectures in the SoC curriculum in practice and equips them with the skills needed to dive deeper into each of the involved topics on their own.","PeriodicalId":177320,"journal":{"name":"2023 IEEE International Symposium on Circuits and Systems (ISCAS)","volume":"252 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123940519","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 68-85GHz Current-Combining Power Amplifier with 20% PAE and 17dBm Psat in 40-nm CMOS","authors":"Jiaqin Fang, Guangyin Feng, Yanjie Wang","doi":"10.1109/ISCAS46773.2023.10181840","DOIUrl":"https://doi.org/10.1109/ISCAS46773.2023.10181840","url":null,"abstract":"This paper presents a 77GHz power amplifier (PA) in 40nm CMOS technology for automotive radars. A symmetrical current-mode power combiner is adopted in the PA design to extend bandwidth, improve efficiency, and maximize output power. The small-signal gain of the PA is 16.4dB with a 3-dB bandwidth from 68 to 85GHz. The proposed PA achieves a saturated output power (Psat) of 17dBm, a Psat-1dB bandwidth of 30GHz from 70 to 100GHz, and a peak power-added efficiency (PAE) of 20%.","PeriodicalId":177320,"journal":{"name":"2023 IEEE International Symposium on Circuits and Systems (ISCAS)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123977598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Area-Efficient Single-Phase-Clocked and Contention-Free Flip-Flop for Ultra-Low-Voltage Operations","authors":"Xue Yuan, Kun Su, Jingyi He, Shi Xu, Jieyu Li, Weifeng He","doi":"10.1109/ISCAS46773.2023.10181517","DOIUrl":"https://doi.org/10.1109/ISCAS46773.2023.10181517","url":null,"abstract":"This paper proposes an area-efficient single-phase-clocked and contention-free flip-flop (FF) targeting ultra-low-voltage (ULV) operations, named TSPC20. To ensure reliable operation in the ULV regime, we eliminate all the contention paths and cut off the longest hold time path, consisting of three stacked transistors, in the conventional single-phase-clocked FF (TSPC18). Moreover, to further reduce the area and power consumption, we remove the redundant transistors through transistor merging and logical expression reorganization. TSPC20, with only 20 transistors, is the most area-efficient FF compared to prior FFs that can operate in the ULV regime. Post-layout simulations in a 28nm process show that TSPC20 achieves 48% (54%) power reduction at 0.3V/6MHz (0.9V/1.4GHz) considering a 10% data activity ratio, compared to the conventional transmission-gate flip-flop (TGFF). The 1K Monte Carlo simulations verify that TSPC20 is functional down to 0.3V.","PeriodicalId":177320,"journal":{"name":"2023 IEEE International Symposium on Circuits and Systems (ISCAS)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125048559","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CKKS-Based Homomorphic Encryption Architecture using Parallel NTT Multiplier","authors":"T. Tan, Jisu Kim, Hanho Lee","doi":"10.1109/ISCAS46773.2023.10181714","DOIUrl":"https://doi.org/10.1109/ISCAS46773.2023.10181714","url":null,"abstract":"This paper presents a high-throughput CKKS-based encryption architecture for homomorphic encryption. By deploying a parallel number theoretic transform (NTT) multiplier architecture, the polynomial multiplication is significantly accelerated. Additionally, the modular multiplier is improved through an efficient implementation using digital signal processing resources. The proposed NTT multiplier and homomorphic encryption architecture are evaluated using Xilinx Vivado and a Xilinx XCU250 FPGA board. The evaluation results demonstrate that the proposed NTT multiplier improves the throughput of polynomial multiplication by at least 1.5x compared to the most recent works. The efficiency of the proposed NTT multiplier, calculated as throughput per LUT or Slice, is much better than that of existing studies. The proposed homomorphic encryption architecture using the proposed NTT multiplier offers a high throughput of 32.7 Gbps.","PeriodicalId":177320,"journal":{"name":"2023 IEEE International Symposium on Circuits and Systems (ISCAS)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125077719","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NBSSN: A Neuromorphic Binary Single-Spike Neural Network for Efficient Edge Intelligence","authors":"Ziyang Shen, Fengshi Tian, Jingwen Jiang, Chaoming Fang, X. Xue, Jie Yang, M. Sawan","doi":"10.1109/ISCAS46773.2023.10181850","DOIUrl":"https://doi.org/10.1109/ISCAS46773.2023.10181850","url":null,"abstract":"Neuromorphic computing approaches such as Spiking Neural Networks (SNN) have been increasingly adopted in bio-signal processing and interpretation due to their intrinsic neurodynamic attributes. Nevertheless, reconciling performance and power efficiency in SNN implementation is still a bottleneck. The single-spike neural coding scheme, which is an extremely sparse coding scheme, provides a solution to bridge the gap. In this work, a neuromorphic architecture using binary single-spike neural signals is proposed with both algorithm and hardware implementation. A sparsity-aware spatial-temporal back-propagation training method is proposed together with a single-spike coding scheme. Also, a novel neuromorphic accelerator is co-designed with algorithmic optimization and implemented in a 40nm CMOS process. Experimental results show that the proposed processor reaches an accuracy of 94.61% on the MNIST dataset, 93.59% on the N-MNIST dataset, and 93.27% on the ECG dataset, while consuming $0.173\\mu\\mathrm{J}$ per ECG classification task and occupying 0.16mm$^{2}$ of on-chip area. The overall power consumption is reduced by 91.68% compared to the state-of-the-art systems.","PeriodicalId":177320,"journal":{"name":"2023 IEEE International Symposium on Circuits and Systems (ISCAS)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129389473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Resource-efficient Face Detector Using 1.5-bit Frame-to-frame Delta Quantization for Image Based Always-on Wake-up Application","authors":"Ning Pu, Kaiji Liu, Heyue Li, Nan Wu, Yaoyu Li, Wen Jia, Zhihua Wang, Hanjun Jiang","doi":"10.1109/ISCAS46773.2023.10182186","DOIUrl":"https://doi.org/10.1109/ISCAS46773.2023.10182186","url":null,"abstract":"A resource-efficient neural-network-based face detector using 1.5-bit frame-to-frame delta quantization with a diagonal spatial feature extraction method is proposed in this paper, designed for resource-limited always-on camera sensors. The proposed architecture performs analog-domain frame differencing for motion sensing, which triggers digital-domain feature extraction. Based on the sparse and effective features, a lightweight convolutional neural network is devised as a classifier. A self-recorded dataset of 313 videos of humans with different appearances, light intensities and backgrounds is used to validate the performance of the proposed method. Simulation results show that the proposed method achieves 93.6% accuracy using only a $\\boldsymbol{50\\times 50}$ pixel array, which is higher than the prior discontinuous temporal change quantization method. Meanwhile, the conservatively estimated power consumption of the proposed method can be reduced by $\\mathbf{14 \\times}$ compared to the state-of-the-art work.","PeriodicalId":177320,"journal":{"name":"2023 IEEE International Symposium on Circuits and Systems (ISCAS)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129888040","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}