IEEE Transactions on Very Large Scale Integration (VLSI) Systems: Latest Articles

A 578-TOPS/W RRAM-Based Binary Convolutional Neural Network Macro for Tiny AI Edge Devices
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2024-10-08 DOI: 10.1109/TVLSI.2024.3469217
Lixun Wang;Yuejun Zhang;Pengjun Wang;Jianguo Yang;Huihong Zhang;Gang Li;Qikang Li
Abstract: The novel nonvolatile computing-in-memory (nvCIM) technology enables data to be stored and processed in situ, providing a feasible solution for the widespread deployment of machine learning algorithms in edge AI devices. However, current nvCIM approaches based on weighted current summation face challenges such as device nonidealities and substantial time, storage, and energy overheads when handling high-precision analog signals. To address these issues, we propose a resistive random access memory (RRAM)-based binary convolution macro for constructing a complete binary convolutional neural network (BCNN) hardware circuit, accelerating edge AI applications with low-weight precision. This macro performs error compensation at the circuit level and provides stable rail-to-rail output, eliminating the need for any ADCs or a processor to perform auxiliary computations. Experimental results demonstrate that the proposed BCNN full-hardware computing system achieves on-chip recognition accuracy of 90.7% (98.64%) on the CIFAR10 (MNIST) dataset, which represents a decrease of 0.98% (0.59%) compared to software recognition accuracy. In addition, this binary convolution macro achieves a maximum throughput of 320 GOPS and a peak energy efficiency of 578 TOPS/W at 136 MHz.
Volume 33, Issue 2, pp. 371-383.
Citations: 0
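For readers unfamiliar with binary convolution, the arithmetic that a BCNN macro carries out can be sketched in software. What follows is the generic XNOR-popcount formulation, not the RRAM circuit from the abstract above; the function names and the bit encoding (1 for +1, 0 for -1) are assumptions chosen for illustration.

```python
def binary_dot(x_bits, w_bits):
    """Dot product of two {-1, +1} vectors encoded as bits (1 -> +1, 0 -> -1)."""
    if len(x_bits) != len(w_bits):
        raise ValueError("operands must have the same length")
    n = len(x_bits)
    # XNOR is 1 where the bits agree; counting those positions is the popcount.
    matches = sum(1 for x, w in zip(x_bits, w_bits) if x == w)
    # Each match contributes +1 and each mismatch -1, so the sum is 2*matches - n.
    return 2 * matches - n


def binary_conv2d(image, kernel):
    """'Valid' 2-D sliding-window binary convolution (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    flat_kernel = [bit for row in kernel for bit in row]
    output = []
    for i in range(out_h):
        out_row = []
        for j in range(out_w):
            # Flatten the window in the same row-major order as the kernel.
            patch = [image[i + di][j + dj] for di in range(kh) for dj in range(kw)]
            out_row.append(binary_dot(patch, flat_kernel))
        output.append(out_row)
    return output


if __name__ == "__main__":
    image = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
    kernel = [[1, 0], [0, 1]]
    print(binary_conv2d(image, kernel))
```

Because every multiply reduces to a bit comparison and every accumulation to a count, no multi-bit multiplier is required, which is one reason binary networks suit low-power hardware.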
Hardware-Accelerator Design by Composition: Dataflow Component Interfaces With Tydi-Chisel
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2024-10-04 DOI: 10.1109/TVLSI.2024.3461330
Casper Cromjongh;Yongding Tian;H. Peter Hofstee;Zaid Al-Ars
Abstract: As dedicated hardware is becoming more prevalent in accelerating complex applications, methods are needed to enable easy integration of multiple hardware components into a single accelerator system. However, this vision of composable hardware is hindered by the lack of standards for interfaces that allow such components to communicate. To address this challenge, the Tydi standard was proposed to facilitate the representation of streaming data in digital circuits, notably providing interface specifications of composite and variable-length data structures. At the same time, constructing hardware in a Scala embedded language (Chisel) provides a suitable environment for deploying Tydi-centric components due to its abstraction level and customizability. This article introduces Tydi-Chisel, a library that integrates the Tydi standard within Chisel, along with a toolchain and methodology for designing data-streaming accelerators. This toolchain reduces the effort needed to design streaming hardware accelerators by raising the abstraction level for streams and module interfaces, thereby avoiding writing boilerplate code, and allows for easy integration of accelerator components from different designers. This is demonstrated through an example project incorporating various scenarios where the interface-related declaration is reduced by 6–14 times. The Tydi-Chisel project repository is available at https://github.com/abs-tudelft/Tydi-Chisel.
Volume 32, Issue 12, pp. 2281-2292.
Citations: 0
SPEED: A Scalable RISC-V Vector Processor Enabling Efficient Multiprecision DNN Inference
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2024-10-04 DOI: 10.1109/TVLSI.2024.3466224
Chuanning Wang;Chao Fang;Xiao Wu;Zhongfeng Wang;Jun Lin
Abstract: Deploying deep neural networks (DNNs) on resource-constrained edge platforms is hindered by their substantial computation and storage demands. Quantized multiprecision DNNs (MP-DNNs) offer a promising solution for these limitations but pose challenges for the existing RISC-V processors due to complex instructions, suboptimal parallel processing, and inefficient dataflow mapping. To tackle these challenges, SPEED, a scalable RISC-V vector (RVV) processor, is proposed to enable efficient MP-DNN inference, incorporating innovations in customized instructions, hardware architecture, and dataflow mapping. First, some dedicated customized RISC-V instructions are introduced based on RVV extensions to reduce the instruction complexity, allowing SPEED to support processing precision ranging from 4- to 16-bit with minimized hardware overhead. Second, a parameterized multiprecision tensor unit (MPTU) is developed and integrated within the scalable module to enhance parallel processing capability by providing reconfigurable parallelism that matches the computation patterns of diverse MP-DNNs. Finally, a flexible mixed dataflow method is adopted to improve computational and energy efficiency according to the computing patterns of different DNN operators. The synthesis of SPEED is conducted on TSMC 28-nm technology. Experimental results show that SPEED achieves a peak throughput of 737.9 GOPS and an energy efficiency of 1383.4 GOPS/W for 4-bit operators. Furthermore, SPEED exhibits superior area efficiency compared with prior RVV processors, with the enhancements of $5.9\sim 26.9\times$ and $8.2\sim 18.5\times$ for 8-bit operator and best integer performance, respectively, which highlights SPEED’s significant potential for efficient MP-DNN inference.
Volume 33, Issue 1, pp. 207-220.
Citations: 0
Low-Power and High-Speed SRAM Cells With Double-Node Upset Self-Recovery for Reliable Applications
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2024-10-03 DOI: 10.1109/TVLSI.2024.3466897
Shuo Cai;Xinjie Liang;Zhu Huang;Weizheng Wang;Fei Yu
Abstract: Transistor sizing and spacing are constantly decreasing due to the continuous advancement of CMOS technology. The charge of the sensitive nodes in the static random access memory (SRAM) cell gradually decreases, making the SRAM cell more and more sensitive to soft errors, such as single-node upsets (SNUs) and double-node upsets (DNUs). Therefore, two types of radiation-hardened SRAM cells are proposed in this article. First, a low-power DNU self-recovery S6P8N cell is proposed. This cell can realize SNU self-recovery from all sensitive nodes as well as partial DNU self-recovery, and it has low power consumption overhead. Second, we propose a high-speed DNU self-recovery S8P6N cell, which has a soft-error tolerance level similar to the S6P8N. Furthermore, it reduces the read access time (RAT) and write access time (WAT). Simulation results show that the proposed cells self-recover from all SNUs and most DNUs. Compared with RHD12, QCCM12T, QUCCE12T, RHMD10T, SEA14T, RHM-12T, S4P8N, S8P4N, RH-14T, HRLP16T, CC18T, and RHM, the average power consumption of S6P8N is reduced by 48.78%, and the average WAT is reduced by 6.62%, while the average power consumption of S8P6N is reduced by 23.64%, and its average WAT and RAT are reduced by 9.07% and 36.84%, respectively.
Volume 33, Issue 2, pp. 475-487.
Citations: 0
A 9.6-nW Wake-Up Timer With RC-Referenced Subharmonic Locking Using Dual Leakage-Based Oscillators
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2024-10-03 DOI: 10.1109/TVLSI.2024.3466850
Jahyun Koo;Hyunwoo Son;Jae-Yoon Sim
Abstract: This brief presents a nano-watt wake-up timer implemented mainly through digital synthesis. By performing successive subharmonic frequency locks between two leakage-based digitally controlled oscillators (DCOs) and repeatedly switching their roles, the period of the timer can be locked to a scaled RC time, enabling low-frequency generation without the need for substantial RC values. The proposed frequency-lock scheme is applied to design a 360 Hz timer. The implemented timer in a 0.18-$\mu$m CMOS process consumes 9.6 nW and shows a standard deviation of 1.36% without the need for extensive external trimming, mainly due to intra-wafer process variation. The measured supply and temperature sensitivities are 0.32%/V and 395 ppm/°C, respectively.
Volume 33, Issue 2, pp. 598-602.
Citations: 0
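The figures quoted for the wake-up timer (9.6 nW, 360 Hz, 0.32%/V, 395 ppm/°C) can be put into perspective with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, the 20 °C and 0.1 V excursions are made-up inputs, and it assumes the two sensitivities are linear and simply add, which the abstract does not state.

```python
POWER_W = 9.6e-9               # power consumption quoted in the abstract
FREQ_HZ = 360.0                # timer frequency quoted in the abstract
TEMP_SENS_PPM_PER_C = 395.0    # temperature sensitivity quoted in the abstract
SUPPLY_SENS_PCT_PER_V = 0.32   # supply sensitivity quoted in the abstract


def energy_per_period_j(power_w=POWER_W, freq_hz=FREQ_HZ):
    """Average energy drawn during one timer period, E = P / f."""
    return power_w / freq_hz


def first_order_drift_pct(delta_temp_c, delta_supply_v):
    """Frequency shift in percent, ASSUMING both sensitivities are linear and add."""
    temp_pct = delta_temp_c * TEMP_SENS_PPM_PER_C * 1e-4  # 1 ppm = 1e-4 percent
    supply_pct = delta_supply_v * SUPPLY_SENS_PCT_PER_V
    return temp_pct + supply_pct


if __name__ == "__main__":
    print(f"energy per period: {energy_per_period_j() * 1e12:.1f} pJ")
    # Hypothetical operating excursion: +20 degC and +0.1 V.
    print(f"first-order drift: {first_order_drift_pct(20.0, 0.1):.3f} %")
```

With the quoted numbers, the energy per period comes to roughly 26.7 pJ, and the hypothetical excursion shifts the frequency by a little under 1% under the stated linear assumption.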
PIPECIM: Energy-Efficient Pipelined Computing-in-Memory Computation Engine With Sparsity-Aware Technique
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2024-10-01 DOI: 10.1109/TVLSI.2024.3462507
Yuanbo Wang;Liang Chang;Jingke Wang;Pan Zhao;Jiahao Zeng;Xin Zhao;Wuyang Hao;Liang Zhou;Haining Tan;Yinhe Han;Jun Zhou
Abstract: Computing-in-memory (CIM) architecture has become a promising solution to improve the parallelism of the multiply-and-accumulation (MAC) operation for artificial intelligence (AI) processors. Recently, revived CIM engines have partly relieved the memory wall issue by integrating computation in/with the memory. However, current CIM solutions still require large data movements with the increase of the practical neural network model and massive input data. Previous CIM works only considered computation without concern for the memory attribute, leading to a low memory computing ratio. This article presents a static random-access memory (SRAM)-based digital CIM macro supporting pipeline mode and computation-memory-aware technique to improve the memory computing ratio. We develop a novel weight driver with fine-grained ping-pong operation, avoiding the computation stall caused by weight update. Based on our evaluation, the peak energy efficiency is 19.78 TOPS/W at the 22-nm technology node, 8-bit width, and 50% sparsity of the input feature map.
Volume 33, Issue 2, pp. 525-536.
Citations: 0
Highly Defect Detectable and SEU-Resilient Robust Scan-Test-Aware Latch Design
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2024-10-01 DOI: 10.1109/TVLSI.2024.3467089
Ruijun Ma;Stefan Holst;Hui Xu;Xiaoqing Wen;Senling Wang;Jiuqi Li;Aibin Yan
Abstract: Soft errors have been a severe threat to the reliability of modern integrated circuits (ICs), making hardened latch designs indispensable for masking soft errors with redundancy. However, the added redundancy also masks production defects as soft errors; this makes it hard to detect defects in hardened latches, thus significantly reducing their reliability. Our previous work proposed the scan-test-aware hardened latch (STAHL) design, the first for addressing the issue of low defect detectability of hardened latch designs. However, STAHL still suffers from two problems: 1) it is not self-resilient to soft errors and 2) a STAHL-based scan design requires one additional control signal. This article proposes a high defect detectable and single-event-upset (SEU)-resilient robust (HIDER) latch to address the issues of the low defect detectability of existing hardened latches and STAHL's lack of SEU-resilient capability. Two scan designs [HIDER-based scan-cell-S (HIDER-SC-S) and HIDER-based scan-cell-F (HIDER-SC-F)], as well as two corresponding test procedures, are proposed to fully test the HIDER latch with only one control signal. Simulation results show that the HIDER latch achieves the highest defect coverage (DC) in both single latch cell detection and scan tests among all existing hardened latch designs. In addition, the HIDER latch has much lower power and a smaller delay than STAHL.
Volume 33, Issue 2, pp. 449-461.
Citations: 0
An End-to-End Bundled-Data Asynchronous Circuits Design Flow: From RTL to GDS
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2024-10-01 DOI: 10.1109/TVLSI.2024.3464870
Jinghai Wang;Shanlin Xiao;Jilong Luo;Bo Li;Lingfeng Zhou;Zhiyi Yu
Abstract: Asynchronous circuits with low power and robustness are revived in emerging applications such as the Internet of Things (IoT) and neuromorphic chips, thanks to clock-less and event-driven mechanisms. However, the lack of mature computer-aided design (CAD) tools for designing large-scale asynchronous circuits results in low design efficiency and high cost. This article proposes an end-to-end bundled-data (BD) asynchronous circuit design flow, which can facilitate building asynchronous circuits, even if the designer has little or no asynchronous circuit foundation. Three features that enable this are: 1) a lightweight circuit converter developed in Python can convert circuits from synchronous descriptions to corresponding asynchronous ones at register transfer level (RTL). The desynchronization flow helps designers maintain a “synchronization mentality” to construct asynchronous circuits; 2) a synchronization-like verification method is proposed for asynchronous circuits so that they can be functionally verified before synthesis. This avoids the risk of rework after logic defects are discovered during synthesis and implementation, as asynchronous circuits often cannot be simulated until gate-level (GL) netlist generation; and 3) the whole implementation flow from RTL to graphic data system (GDS) is based on commercial electronic design automation (EDA) tools. Similar to the design flow of synchronous circuits, it helps designers implement asynchronous circuits with “synchronization habits.” Furthermore, to validate this methodology, two asynchronous processors were implemented and evaluated in the TSMC 28-nm CMOS process. Compared to their synchronous counterparts, the general-purpose asynchronous RISC-V processor achieves 20.5% power savings, and the domain-specific asynchronous spiking neural network (SNN) accelerator achieves 58.46% power savings and $2.41\times$ energy efficiency improvement at 70% input spike sparsity.
Volume 33, Issue 1, pp. 154-167.
Citations: 0
Analysis and Design of Wideband GaAs Digital Step Attenuators
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2024-09-30 DOI: 10.1109/TVLSI.2024.3461715
Quanzhen Liang;Xiao Wang;Kuisong Wang;Yuepeng Yan;Xiaoxin Liang
Abstract: This brief analyses the causes of amplitude and phase errors in digital step attenuators (DSAs), and proposes two novel structures, namely, the series inductive compensation structure (SICS) and the small-bit compensation structure, to reduce these two kinds of errors. A 6-bit DSA with ultrawideband, low insertion loss, and high accuracy is presented, which has an area of only 0.51 mm² and shows an attenuation range of 31.5 dB in 0.5 dB steps. Measurements reveal that the root-mean-square (rms) amplitude and phase errors for the 64 attenuation states are within 0.18 dB and 8°, respectively. The insertion loss is better than 2.54 dB, and the input 1 dB compression point (IP1 dB) is better than 29 dBm. To the best of our knowledge, this chip presents the highest attenuation accuracy, the lowest insertion loss, the best IP1 dB, and a good matching performance in the range of 2–22 GHz using the 0.25-$\mu$m GaAs p-HEMT process.
Volume 33, Issue 2, pp. 583-587.
Citations: 0
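The relationship between the quoted bit count, step size, state count, and attenuation range of the step attenuator can be checked with a short calculation. The sketch below assumes the usual binary-weighted arrangement (cells of 0.5, 1, 2, 4, 8, and 16 dB), which the abstract does not spell out; the function name and parameters are hypothetical.

```python
def dsa_attenuation_db(code, bits=6, lsb_db=0.5):
    """Nominal attenuation for a control code, ASSUMING binary-weighted cells."""
    if not 0 <= code < (1 << bits):
        raise ValueError(f"code must be in [0, {(1 << bits) - 1}]")
    total = 0.0
    for k in range(bits):
        if (code >> k) & 1:
            # Bit k switches in a cell worth lsb_db * 2**k decibels.
            total += lsb_db * (1 << k)
    return total


if __name__ == "__main__":
    states = [dsa_attenuation_db(code) for code in range(1 << 6)]
    print(len(states), "states, from", min(states), "dB to", max(states), "dB")
```

Six bits give 2^6 = 64 states, and with a 0.5 dB least significant step the largest code yields 63 × 0.5 = 31.5 dB, matching the range and state count quoted in the abstract.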
A Compact 0.9μW Direct-Conversion Frequency Analyzer for Speech Recognition With Wide-Range Q-Controllable Bandpass Rectifier
IF 2.8, Zone 2, Engineering and Technology
IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2024-09-26 DOI: 10.1109/TVLSI.2024.3453314
Shiro Dosho;Ludovico Minati;Kazuki Maari;Shungo Ohkubo;Hiroyuki Ito
Abstract: The development of ultralow-power analog front ends for edge artificial intelligence (AI) is actively pursued; however, these front ends suffer from low frequency-selection accuracy, leading to increased training loads for the AI components and higher testing costs. In this article, we propose a novel circuit that fundamentally addresses these issues through direct conversion. By re-evaluating the circuit configurations of the multiplier, harmonic removal filter, and full-wave rectifier (FWR) from scratch, we have miniaturized and integrated an ultralow-power converter that transforms frequency components into pulse sequences. The frequency to be analyzed is determined by the local frequency input to the multiplier, which can be digitally controlled with high precision. In our system, the Q value is adaptively adjusted by the local frequency of the direct conversion, allowing the same circuit configuration to be applied to all frequency nodes, eliminating the need for filter design for each node and providing a highly design-friendly and scalable frequency analysis system. The test chip was fabricated with a 0.18-$\mu$m process, operating at a 1.2-V supply, and outputting power pulse streams corresponding to 11 different frequencies ranging from 500 Hz to 5 kHz. The total operating power was $0.9\,\mu$W, with an achieved equivalent Q factor ranging from 3.6 to 36. In a training experiment using a convolutional neural network (CNN) speech recognition model constructed with a functional model equivalent to this front end, a recognition rate exceeding 80% was achieved, demonstrating the practicality of this front end.
Volume 33, Issue 2, pp. 315-325. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10695034
Citations: 0