{"title":"Analysis and Design of Wideband Active Single-Sideband Time Modulator in 0.13-μm CMOS","authors":"Guoxiao Cheng, Jin-Dong Zhang, Qiaoyu Chen, Wen Wu","doi":"10.1109/tcsi.2024.3456237","DOIUrl":"https://doi.org/10.1109/tcsi.2024.3456237","url":null,"abstract":"","PeriodicalId":13039,"journal":{"name":"IEEE Transactions on Circuits and Systems I: Regular Papers","volume":"100 1","pages":""},"PeriodicalIF":5.1,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142254108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NASH: Neural Architecture and Accelerator Search for Multiplication-Reduced Hybrid Models","authors":"Yang Xu;Huihong Shi;Zhongfeng Wang","doi":"10.1109/TCSI.2024.3457628","DOIUrl":"https://doi.org/10.1109/TCSI.2024.3457628","url":null,"abstract":"The significant computational cost of multiplications hinders the deployment of deep neural networks (DNNs) on edge devices. While multiplication-free models offer enhanced hardware efficiency, they typically sacrifice accuracy. As a solution, multiplication-reduced hybrid models have emerged to combine the benefits of both approaches. Particularly, prior works, i.e., NASA and NASA-F, leverage Neural Architecture Search (NAS) to construct such hybrid models, enhancing hardware efficiency while maintaining accuracy. However, they either entail costly retraining or encounter gradient conflicts, limiting both search efficiency and accuracy. Additionally, they overlook the acceleration opportunity introduced by accelerator search, yielding sub-optimal hardware performance. To overcome these limitations, we propose NASH, a Neural architecture and Accelerator Search framework for multiplication-reduced Hybrid models. Specifically, as for NAS, we propose a tailored zero-shot metric to pre-identify promising hybrid models before training, enhancing search efficiency while alleviating gradient conflicts. Regarding accelerator search, we innovatively introduce coarse-to-fine search to streamline the search process. Furthermore, we seamlessly integrate these two levels of searches to unveil NASH, obtaining optimal model and accelerator pairing. 
Experiments validate our effectiveness, e.g., when compared with the state-of-the-art multiplication-based system, we can achieve ↑2.14× throughput and ↑2.01× FPS with ↑0.25% accuracy on CIFAR-100, and ↑1.40× throughput and ↑1.19× FPS with ↑0.56% accuracy on Tiny-ImageNet. Codes are available at https://github.com/xuyang527/NASH.","PeriodicalId":13039,"journal":{"name":"IEEE Transactions on Circuits and Systems I: Regular Papers","volume":"71 12","pages":"5956-5968"},"PeriodicalIF":5.2,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142736535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Offset Boosting-Oriented Construction of Multi-Scroll Attractor via a Memristor Model","authors":"Yongxin Li, Chunbiao Li, Sen Zhang, Yuanjin Zheng, Guanrong Chen","doi":"10.1109/tcsi.2024.3455350","DOIUrl":"https://doi.org/10.1109/tcsi.2024.3455350","url":null,"abstract":"","PeriodicalId":13039,"journal":{"name":"IEEE Transactions on Circuits and Systems I: Regular Papers","volume":"16 1","pages":""},"PeriodicalIF":5.1,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142254113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LLD: Lightweight Latency Decrease Scheme of LDPC Hard Decision Decoding for 3-D TLC NAND Flash Memory","authors":"Debao Wei;Yongchao Wang;Hua Feng;Huqi Xiang;Liyan Qiao","doi":"10.1109/TCSI.2024.3438789","DOIUrl":"https://doi.org/10.1109/TCSI.2024.3438789","url":null,"abstract":"The low-density parity-check code (LDPC) has been widely used to significantly enhance the reliability of 3-D NAND flash memory. However, in cases where the raw bit error rate (RBER) of the data is high, it not only demands more sense levels but also requires a large number of iterations, leading to a notable read latency issue. To mitigate this challenge, this paper introduces an innovative lightweight latency decrease (LLD) scheme. Initially, by examining the correlation between the number of iterations and the hard decision level (HDL), a functional model that encapsulates the relationship between iteration and offset is established. Building upon this model, the all-wordlines latency decrease (AWLD) scheme is proposed. In an effort to further decrease latency, an in-depth analysis of the similarities among different wordlines within a flash memory block is conducted, leading to the development of an optimized one-wordline lightweight latency decrease (OWLLD) scheme. For scenarios involving random reading of small data volumes, the interplay between function models of various overlapping regions is delved into, which ultimately results in the proposal of a further optimized one-page lightweight latency decrease (OPLLD) scheme. Experimental findings reveal that the OPLLD scheme can enhance the iterative performance of LDPC by up to 94.63% and reduce latency by up to 66.89% compared to traditional algorithms, while incurring minimal storage and computational overhead. 
This clearly indicates that the proposed scheme substantially enhances the read latency performance of LDPC in flash memory.","PeriodicalId":13039,"journal":{"name":"IEEE Transactions on Circuits and Systems I: Regular Papers","volume":"71 10","pages":"4611-4623"},"PeriodicalIF":5.2,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142376906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 12 ps Precision Two-Step Time-to-Digital Converter Consuming 434 μW at 1 MS/s in 180 nm CMOS With a Dual-Slope Time Amplifier","authors":"Xinchi Xu, Yonggang Wang, Yonghang Zhou, Zhengqi Song, Bo Wu, Xin Lin","doi":"10.1109/tcsi.2024.3454793","DOIUrl":"https://doi.org/10.1109/tcsi.2024.3454793","url":null,"abstract":"","PeriodicalId":13039,"journal":{"name":"IEEE Transactions on Circuits and Systems I: Regular Papers","volume":"14 1","pages":""},"PeriodicalIF":5.1,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142254109","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Real-Time and High Precision Hardware Implementation of RANSAC Algorithm for Visual SLAM Achieving Mismatched Feature Point Pair Elimination","authors":"Wenzheng He;Zikuo Lu;Xin Liu;Ziwei Xu;Jingshuo Zhang;Chen Yang;Li Geng","doi":"10.1109/TCSI.2024.3422082","DOIUrl":"10.1109/TCSI.2024.3422082","url":null,"abstract":"The visual SLAM (vSLAM) algorithm is becoming a research hotspot in recent years because of its low cost and low delay. Due to the advantage of fitting irregular data input, random sample consensus (RANSAC) has become a commonly used method in vSLAM to eliminate mismatched feature point pairs in adjacent frames. However, the huge number of iterations and computational complexity of the algorithm make the hardware implementation and integration of the entire system challenging. This paper pioneeringly proposes an efficient hardware acceleration design with homography matrix as RANSAC hypothesis model, which achieves high speed and high precision. Through optimizing the direct linear transformation (DLT) method, the delay and resource consumption are reduced. The design is implemented on FPGA. Through the verification of Xilinx Zynq 7100 platform, the processing frame rate on EuRoc dataset is 709 fps, reaching an average speed up of 263.2× against ARM CPU, and a speed up of 1.2∼50.0× compared with the advanced implementations in RANSAC part, which fully meets the real-time requirements. 
In addition, the root-mean-square error (RMSE) based on an open-source SLAM system (ICE-BA) on the EuRoc dataset reached 0.105 m, achieving an improvement of 15.6% in precision compared to the original ICE-BA system.","PeriodicalId":13039,"journal":{"name":"IEEE Transactions on Circuits and Systems I: Regular Papers","volume":"71 11","pages":"5102-5114"},"PeriodicalIF":5.2,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142179157","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive Event-Triggered Control for PDE-ODE Cascade Systems via Hierarchical Sliding Mode","authors":"Shanlin Liu, Yingwei Zhang, Xudong Zhao","doi":"10.1109/tcsi.2024.3446621","DOIUrl":"https://doi.org/10.1109/tcsi.2024.3446621","url":null,"abstract":"","PeriodicalId":13039,"journal":{"name":"IEEE Transactions on Circuits and Systems I: Regular Papers","volume":"36 1","pages":""},"PeriodicalIF":5.1,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142179159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Specific ADC of NVM-Based Computation-in-Memory for Deep Neural Networks","authors":"Ao Shi;Yizhou Zhang;Lixia Han;Zheng Zhou;Yiyang Chen;Haozhang Yang;Lifeng Liu;Linxiao Shen;Xiaoyan Liu;Jinfeng Kang;Peng Huang","doi":"10.1109/TCSI.2024.3430290","DOIUrl":"10.1109/TCSI.2024.3430290","url":null,"abstract":"Non-volatile memory (NVM)-based Computation-in-memory has demonstrated a significant advantage in high-efficiency neural networks. However, the requirement of analog-to-digital converter (ADC) and post-processing circuits not only cost high energy and area but also results in high computation errors, which tradeoffs the performance boost brought by CIM. Here, we present a specific ADC and post-processing circuit of the NVM-based CIM neural network to address these issues. The main contributions include: (1) A novel residual charge accumulation function (RCA) is designed to achieve charge-domain summation of quantized partial sum and reduces 38% quantization error; (2) Charge reset is introduced in the integrate & fire circuit to realize <1> 3.95× energy efficiency and 2.48× area efficiency. 
Evaluation based on the measured results of the fabricated chip shows that the VGG-11 neural network with the proposed ADC circuit can achieve a 3.28-time improvement in energy efficiency while maintaining the same network recognition rate.","PeriodicalId":13039,"journal":{"name":"IEEE Transactions on Circuits and Systems I: Regular Papers","volume":"71 12","pages":"5387-5399"},"PeriodicalIF":5.2,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142179160","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":": A Tiny RISC-V-Controlled SNN Processor for Real-Time Sensor Data Analysis on Low-Power FPGAs","authors":"Gianluca Leone, Matteo Antonio Scrugli, Lorenzo Badas, Luca Martis, Luigi Raffo, Paolo Meloni","doi":"10.1109/tcsi.2024.3450966","DOIUrl":"https://doi.org/10.1109/tcsi.2024.3450966","url":null,"abstract":"","PeriodicalId":13039,"journal":{"name":"IEEE Transactions on Circuits and Systems I: Regular Papers","volume":"82 1","pages":""},"PeriodicalIF":5.1,"publicationDate":"2024-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142179161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}