Automatic gain control and multi-style training for robust small-footprint keyword spotting with deep neural networks
Rohit Prabhavalkar, R. Álvarez, Carolina Parada, Preetum Nakkiran, Tara N. Sainath
2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), April 2015. DOI: 10.1109/ICASSP.2015.7178863
We explore techniques to improve the robustness of small-footprint keyword spotting models based on deep neural networks (DNNs) in the presence of background noise and in far-field conditions. We find that system performance can be improved significantly, with relative improvements of up to 75% in far-field conditions, by employing a combination of multi-style training and a proposed novel formulation of automatic gain control (AGC) that estimates the levels of both speech and background noise. Further, we find that these techniques allow us to achieve competitive performance, even when applied to DNNs with an order of magnitude fewer parameters than our baseline.
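The abstract does not spell out the proposed AGC formulation, but the core idea of estimating separate speech and noise levels and normalizing toward a target speech level can be illustrated with a minimal, hypothetical sketch. The function name, smoothing constants, and gating rule below are assumptions made for illustration; this is not the authors' method.

```python
# Hypothetical level-tracking AGC sketch (NumPy), illustrative only:
# track separate speech and noise level estimates from per-frame energy,
# then scale speech-like frames toward a target level.
import numpy as np

def simple_agc(frames, target_db=-20.0, attack=0.80, release=0.995):
    """frames: array of shape (num_frames, frame_len), samples in [-1, 1]."""
    energy_db = 10.0 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
    speech_level = noise_level = energy_db[0]
    out = np.empty_like(frames)
    for i, e in enumerate(energy_db):
        # Speech level tracks the energy envelope: fast attack, slow release.
        a = attack if e > speech_level else release
        speech_level = a * speech_level + (1.0 - a) * e
        # Noise level tracks the floor: fast decay, slow rise.
        a = attack if e < noise_level else release
        noise_level = a * noise_level + (1.0 - a) * e
        # Apply gain only to frames above the midpoint between the noise and
        # speech estimates, so that background noise is not boosted.
        if e > 0.5 * (speech_level + noise_level):
            gain_db = target_db - speech_level
        else:
            gain_db = 0.0
        out[i] = frames[i] * 10.0 ** (gain_db / 20.0)
    return out
```

Per the abstract, such gain normalization is combined with multi-style training, which conventionally means training on data corrupted with representative noise and far-field conditions, so the two techniques complement each other at training and inference time.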