A 3.8-μW 10-Keyword Noise-Robust Keyword Spotting Processor Using Symmetric Compressed Ternary-Weight Neural Networks

IF 3.2

IEEE Open Journal of the Solid-State Circuits Society. Pub Date: 2023-09-06. DOI: 10.1109/OJSSCS.2023.3312354

Bo Liu; Na Xie; Renyuan Zhang; Haichuan Yang; Ziyu Wang; Deliang Fan; Zhen Wang; Weiqiang Liu; Hao Cai

Volume 3, pages 185-196. Full text: https://ieeexplore.ieee.org/document/10242041/

Citations: 0

Abstract



A keyword spotting (KWS) processor based on a ternary-weight neural network (TWN) is proposed to support complicated and variable application scenarios. To recognize ten keywords with high accuracy across a wide range of background noise, from 5 dB to clean conditions, it uses a convolutional neural network consisting of four convolution layers and four fully connected layers, trained with a modified sparsity-controllable ternary-weight method based on truncated Gaussian approximation. End-to-end optimization combines three techniques: 1) a stage-by-stage bit-width selection algorithm that reduces the hardware overhead of the FFT; 2) a lossy compressed TWN with symmetric kernel training (SKT) and a dedicated internal data reuse computation flow; and 3) an error-intercompensation approximate addition tree that reduces computation overhead with marginal accuracy loss. Fabricated in an industrial 22-nm CMOS process, the processor recognizes up to ten keywords in real time under 11 background noise types, with an accuracy of 90.6% on clean speech and 85.4% at 5 dB. It consumes an average power of 3.8 μW at 250 kHz, and its normalized energy efficiency is 2.79× higher than the state of the art.
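
The ternary-weight idea behind the processor can be illustrated with a minimal sketch. The abstract does not describe the modified sparsity-controllable truncated-Gaussian training in detail, so the code below substitutes a much simpler, generic scheme: prune the smallest-magnitude weights to reach a target sparsity, keep only the sign of the rest, and share one scale factor. The names `ternarize` and `ternary_dot` and the `sparsity` parameter are hypothetical, chosen only for illustration, and this is not the authors' method.

```python
from typing import List, Tuple


def ternarize(weights: List[float], sparsity: float) -> Tuple[List[int], float]:
    """Map real weights to {-1, 0, +1} with about `sparsity` fraction of zeros.

    Illustrative stand-in (assumption), not the paper's training procedure.
    """
    if not weights:
        raise ValueError("weights must be non-empty")
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")

    # Number of smallest-magnitude weights forced to zero (keep at least one).
    n_zero = min(int(sparsity * len(weights)), len(weights) - 1)

    # Indices ordered by magnitude; the first n_zero are pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_zero])

    # Surviving weights keep only their sign.
    ternary = [
        0 if i in pruned else (1 if w > 0 else -1 if w < 0 else 0)
        for i, w in enumerate(weights)
    ]

    # One shared scale: mean magnitude of the surviving weights.
    kept = [abs(w) for i, w in enumerate(weights) if i not in pruned]
    scale = sum(kept) / len(kept)
    return ternary, scale


def ternary_dot(ternary: List[int], scale: float, inputs: List[float]) -> float:
    """Dot product that needs only additions and subtractions, plus one multiply."""
    acc = 0.0
    for t, x in zip(ternary, inputs):
        if t == 1:
            acc += x
        elif t == -1:
            acc -= x
        # t == 0 contributes nothing, so sparse weights skip work entirely.
    return scale * acc
```

For example, `ternarize([0.9, -0.05, 0.1, -0.7], 0.5)` zeroes the two smallest weights and keeps the signs of the other two. Because every multiply-accumulate collapses to an add, a subtract, or a skip, this kind of representation suits the low-power accumulation hardware the abstract describes.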


Source journal

IEEE Open Journal of the Solid-State Circuits Society

Self-citation rate

0.00%

Articles published