An End-to-End Model Based on TDNN-BiGRU for Keyword Spotting

2019 International Conference on Asian Language Processing (IALP). Pub Date: 2019-11-01. DOI: 10.1109/IALP48816.2019.9037714

Shuzhou Chai, Zhenye Yang, Changsheng Lv, Weiqiang Zhang

Citations: 2

Abstract

In this paper, we propose a neural network architecture based on a Time-Delay Neural Network (TDNN) and a Bidirectional Gated Recurrent Unit (BiGRU) for small-footprint keyword spotting. Our model consists of three parts: a TDNN, a BiGRU, and an attention mechanism. The TDNN models temporal information, and the BiGRU extracts hidden-layer features from the audio. The attention mechanism produces a fixed-length vector from the hidden-layer features. The system generates the final score by applying a linear transformation and a softmax function to this vector. We explore the step size and unit size of the TDNN, as well as two attention mechanisms. Our model achieves a true positive rate of 99.63% at a 5% false positive rate.
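The last two stages described in the abstract (attention pooling to a fixed-length vector, then a linear transformation and softmax) can be illustrated with a minimal sketch. The abstract does not specify either of the two attention mechanisms the authors compared, so the dot-product scoring against a query vector used here is an assumption, as are all function and parameter names. The hidden states stand in for BiGRU outputs; the TDNN and BiGRU themselves are not implemented.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_pool(hidden_states, query):
    """Collapse T hidden vectors into one fixed-length context vector.

    Assumption: frame scores are dot products with a query vector
    (hypothetical; the paper's two attention variants are not given
    in the abstract).
    """
    scores = [dot(h, query) for h in hidden_states]
    weights = softmax(scores)  # one weight per frame, summing to 1
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return context, weights


def keyword_posteriors(context, weight_rows, bias):
    """Linear transformation of the context vector followed by softmax."""
    logits = [dot(row, context) + b for row, b in zip(weight_rows, bias)]
    return softmax(logits)


# Toy stand-in for BiGRU outputs: 3 frames, 2 features per frame.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.0, 0.0]  # zero query gives uniform attention weights
context, weights = attention_pool(hidden_states, query)

# Two output classes (for example keyword vs. non-keyword), toy parameters.
posteriors = keyword_posteriors(context, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

The context vector has the same length regardless of how many frames the utterance contains, which is what lets a single linear layer score utterances of varying duration.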

