Convolutive Bottleneck Network features for LVCSR

2011 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2011-12-01. DOI: 10.1109/ASRU.2011.6163903

Karel Veselý, M. Karafiát, F. Grézl

Citations: 108

Abstract

In this paper, we focus on improvements to the bottleneck ANN in a Tandem LVCSR system. First, the influence of the training set size and the ANN size is evaluated. Second, a very positive effect of a linear bottleneck is shown. Finally, a Convolutive Bottleneck Network is proposed as an extension of the current state-of-the-art Universal Context Network. The proposed training method leads to a 5.5% relative reduction in WER compared to the Universal Context ANN baseline. The relative improvement compared to the 5-layer single-bottleneck network is 17.7%. The ctstrain07 dataset, composed of more than 2000 hours of English Conversational Telephone Speech, was used for the experiments. The TNet toolkit, with its CUDA GPGPU implementation, was used for fast training.
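The 5.5% and 17.7% figures above are relative, not absolute, WER reductions. As a minimal sketch of how such a figure is conventionally computed, the helper below uses the standard definition (baseline minus new, divided by baseline). The function name and the WER values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction in percent.

    Computed as 100 * (baseline_wer - new_wer) / baseline_wer.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical WER values (in percent), for illustration only; not from the paper.
baseline = 40.0
improved = 37.8
print(round(relative_wer_reduction(baseline, improved), 1))  # prints 5.5
```

Under this definition, an absolute drop of 2.2 WER points from a 40.0% baseline corresponds to a 5.5% relative reduction.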



