Convolutional Neural Networks for Deep Spoken Keyword Spotting

Nayyer Aafaq, Mehran Saleem, Jahanzeb Tariq Khan, I. Abbasi
Published in: 2023 3rd International Conference on Artificial Intelligence (ICAI)
Publication date: 2023-02-22
DOI: 10.1109/ICAI58407.2023.10136648 (https://doi.org/10.1109/ICAI58407.2023.10136648)

Abstract

With the growth of biometric security applications, monitoring of mobile and telephone communication, and digital assistants, the practical applications of Keyword Spotting (KWS) have increased manyfold. The use of Artificial Intelligence in KWS has greatly improved its accuracy. In this work, after an analysis of various feature extraction and Deep Learning techniques, KWS is performed in both non-streaming and streaming modes. Speech features are extracted using Mel-spectrograms and Mel-frequency Cepstral Coefficients (MFCCs). Of the three broad categories of Deep Neural Networks, a Convolutional Neural Network (CNN) model is implemented for KWS, as it outperforms Recurrent Neural Networks (RNNs) and Feedforward Neural Networks (FFNNs) owing to its lower complexity and computational cost. These techniques were applied to the Google Speech Commands Dataset, provided by Google, both online and offline.
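The feature extraction step named in the abstract (Mel-spectrograms followed by MFCCs) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation: the frame length, hop size, number of filters, the HTK-style mel formula, the Hamming window, and all function names are assumptions chosen for illustration, and a real pipeline would use an FFT library instead of the naive DFT shown here.

```python
import cmath
import math


def hz_to_mel(hz):
    # HTK-style mel scale (one common convention; assumed here).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    # Naive DFT of a Hamming-windowed frame; returns bins 0..N/2.
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose edges are evenly spaced on the mel scale.
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (n_filters + 1)) * n_fft / sample_rate
             for i in range(n_filters + 2)]  # fractional bin positions
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def log_mel_frames(signal, sample_rate, frame_len=256, hop=128, n_filters=10):
    # One row of log filterbank energies per overlapping frame.
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        spec = power_spectrum(signal[start:start + frame_len])
        frames.append([math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
                       for row in bank])
    return frames


def mfcc_frames(log_mel, n_coeffs=5):
    # Unnormalized DCT-II over the mel axis; keep the first n_coeffs terms.
    out = []
    for frame in log_mel:
        n = len(frame)
        out.append([sum(v * math.cos(math.pi * c * (j + 0.5) / n)
                        for j, v in enumerate(frame))
                    for c in range(n_coeffs)])
    return out
```

Calling `log_mel_frames` on a list of audio samples gives a frames-by-filters matrix, and `mfcc_frames` compresses each row into a few cepstral coefficients. Either matrix is the kind of two-dimensional, image-like input a CNN classifier could consume. Because each frame depends only on its own samples, the same functions could in principle be applied incrementally to incoming audio in a streaming setting.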