Frame-level AnyBoost for LVCSR with the MMI Criterion

2011 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2011-12-01. DOI: 10.1109/ASRU.2011.6163897

Ryuki Tachibana, Takashi Fukuda, U. Chaudhari, B. Ramabhadran, P. Zhan

Citations: 1

Abstract

This paper proposes a variant of AnyBoost for large vocabulary continuous speech recognition (LVCSR). AnyBoost is an efficient algorithm that trains an ensemble of weak learners by gradient descent on an objective function. We present a novel training procedure that trains acoustic models under the maximum mutual information (MMI) criterion, using data weighted in proportion to the sum of the posterior functions of the weak learners from previous rounds. Because the procedure is optimized for system combination by n-best ROVER at runtime, the data weights for a new weak learner are computed as a weighted sum of the posteriors of the previous weak learners. We compare a frame-based version and a sentence-based version of the proposed algorithm with a frame-based AdaBoost algorithm. On a voice search task, with models trained on different amounts of data, three rounds of boosting yield relative word error rate (WER) gains of 5.1% to 7.5%.
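The weighting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `frame_weights`, the data layout, the combination weights `alphas`, and the final normalization are all assumptions made for illustration, and the abstract does not say which posterior quantity is used or whether it is transformed before weighting.

```python
def frame_weights(posteriors, alphas):
    """Per-frame data weights for the next weak learner (illustrative only).

    posteriors[k][t] is a hypothetical per-frame posterior score produced by
    previous weak learner k at frame t; alphas[k] is a hypothetical
    combination weight for learner k. The weight of frame t is taken to be
    proportional to the weighted sum of the previous learners' posteriors,
    following the wording of the abstract.
    """
    if not posteriors or len(posteriors) != len(alphas):
        raise ValueError("need one combination weight per previous learner")

    n_frames = len(posteriors[0])
    if any(len(p) != n_frames for p in posteriors):
        raise ValueError("all learners must score the same frames")

    # Weighted sum of posteriors over the previous learners, per frame.
    combined = [
        sum(a * p[t] for a, p in zip(alphas, posteriors))
        for t in range(n_frames)
    ]

    # Normalize so the weights sum to 1 (an assumption, not from the paper).
    total = sum(combined)
    if total == 0:
        return [1.0 / n_frames] * n_frames
    return [c / total for c in combined]


# Two previous learners scoring two frames, combined with equal weights.
weights = frame_weights([[0.2, 0.8], [0.4, 0.6]], [1.0, 1.0])
```

A sentence-based variant would apply the same idea with one score per utterance instead of one per frame.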
