Convolutional Neural Networks for Deep Spoken Keyword Spotting

2023 3rd International Conference on Artificial Intelligence (ICAI), Pub Date: 2023-02-22, DOI: 10.1109/ICAI58407.2023.10136648

Nayyer Aafaq, Mehran Saleem, Jahanzeb Tariq Khan, I. Abbasi

Abstract

With the growth of biometric security applications, mobile and telephonic communication monitoring, and digital assistants, the practical applications of Keyword Spotting (KWS) have increased manyfold. The use of Artificial Intelligence in KWS has greatly improved its accuracy. In this work, after analyzing various feature extraction and Deep Learning techniques, KWS is performed in both non-streaming and streaming modes. Speech features are extracted using Mel-spectrograms and Mel-frequency Cepstral Coefficients (MFCCs). Of the three broad categories of Deep Neural Networks, a Convolutional Neural Network (CNN) model is implemented for KWS, as it outperforms Recurrent Neural Network (RNN) and Feedforward Neural Network (FFNN) models owing to its lower complexity and low computational cost. These techniques were applied to the Google Speech Commands Dataset, provided by Google, both online and offline.
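
As a rough illustration of the feature-extraction step named in the abstract, the sketch below computes log-Mel energies and MFCCs for a single audio frame in plain Python. It is not the authors' pipeline: the HTK-style mel formula, the Hann window, the 256-sample frame, the 20 filters, the 13 coefficients, and all function names are assumptions chosen for the example, and the naive DFT is only meant for readability.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to mels (HTK-style formula, an assumption here)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hann-window one frame and return its one-sided power spectrum (naive DFT)."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(num_filters, frame_len, sample_rate):
    """Triangular filters whose centres are equally spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    # num_filters + 2 edge points: each filter spans three consecutive points.
    edges_hz = [mel_to_hz(top * i / (num_filters + 1))
                for i in range(num_filters + 2)]
    bin_hz = [k * sample_rate / frame_len for k in range(frame_len // 2 + 1)]
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        bank.append([max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid)))
                     for f in bin_hz])
    return bank


def log_mel(frame, bank, eps=1e-10):
    """Log of the mel-weighted power in each band; eps avoids log(0)."""
    spec = power_spectrum(frame)
    return [math.log(sum(w * p for w, p in zip(row, spec)) + eps) for row in bank]


def mfcc(log_mel_energies, num_coeffs):
    """Unnormalised DCT-II of the log-mel energies, keeping the first coefficients."""
    n = len(log_mel_energies)
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n)
                for j, e in enumerate(log_mel_energies))
            for c in range(num_coeffs)]


# Example input: one frame of a 1 kHz tone sampled at 16 kHz (synthetic data).
sample_rate, frame_len = 16000, 256
tone = [math.sin(2.0 * math.pi * 1000.0 * i / sample_rate) for i in range(frame_len)]
bank = mel_filterbank(20, frame_len, sample_rate)
energies = log_mel(tone, bank)
coeffs = mfcc(energies, 13)
peak = max(range(len(energies)), key=energies.__getitem__)
print("strongest mel band index:", peak)
```

Stacking the per-frame `energies` vectors over time would give a Mel-spectrogram, and stacking `coeffs` would give an MFCC matrix; either two-dimensional array is the kind of input a CNN keyword classifier typically consumes.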
