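14.1 A 510nW 0.41V Low-Memory Low-Computation Keyword-Spotting Chip Using Serial FFT-Based MFCC and Binarized Depthwise Separable Convolutional Neural Network in 28nm CMOS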

Weiwei Shan, Minhao Yang, Jiaming Xu, Yicheng Lu, Shuai Zhang, Tao Wang, Jun Yang, Longxing Shi, Mingoo Seok
2020 IEEE International Solid-State Circuits Conference (ISSCC), published 2020-02-01
DOI: 10.1109/ISSCC19947.2020.9063000 (https://doi.org/10.1109/ISSCC19947.2020.9063000)
Citations: 30

Abstract

Ultra-low power is a strong requirement for always-on speech interfaces in wearable and mobile devices, such as Voice Activity Detection (VAD) and Keyword Spotting (KWS) [1]–[5]. A KWS system detects specific wake-up words spoken by users and therefore has to be always on. Previous KWS ASICs have not achieved energy-efficient implementations with power below $5\,\mu\mathrm{W}$. For example, a deep neural network (DNN)-based KWS design [1] has a large on-chip weight memory of 270KB and consumes $288\,\mu\mathrm{W}$. A binarized convolutional neural network (CNN) used 52KB of SRAM and consumed $141\,\mu\mathrm{W}$ of wake-up power at 2.5MHz and 0.57V [2]. An LSTM-based SoC used 105KB of SRAM and reduced power to $16.11\,\mu\mathrm{W}$ for KWS, with 90.8% accuracy on the Google Speech Command Dataset (GSCD) [3]. Laika reduced power to $5\,\mu\mathrm{W}$ [4], but that figure does not include the Mel-Frequency Cepstral Coefficient (MFCC) circuit. High compute and memory requirements have so far prevented always-on KWS chips from operating in the sub-$\mu\mathrm{W}$ range.
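To put the quoted figures side by side, the short sketch below divides each prior design's power by the 510nW stated in the paper title. The dictionary keys and the function name `power_ratios` are illustrative labels, not names from the paper. The ratios are only a rough guide, since the cited designs were measured at different operating points and on different tasks, and the Laika figure excludes the MFCC circuit.

```python
# Power figures quoted in the abstract, in microwatts.
PRIOR_POWER_UW = {
    "DNN-based KWS [1]": 288.0,
    "Binarized CNN [2]": 141.0,
    "LSTM-based SoC [3]": 16.11,
    "Laika [4] (excluding MFCC)": 5.0,
}

# 510 nW, taken from the paper title, expressed in microwatts.
THIS_WORK_UW = 0.510


def power_ratios(prior, target_uw):
    """Return how many times more power each prior design draws than the target."""
    return {name: power / target_uw for name, power in prior.items()}


ratios = power_ratios(PRIOR_POWER_UW, THIS_WORK_UW)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x the power of a 510 nW design")
```

Even the lowest prior figure is roughly an order of magnitude above 510nW, which is the gap the abstract attributes to compute and memory requirements.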