{"title":"Very large vocabulary isolated utterance recognition: a comparison between one pass and two pass strategies","authors":"L. Fissore, P. Laface, G. Micca, R. Pieraccini","doi":"10.1109/ICASSP.1988.196549","DOIUrl":null,"url":null,"abstract":"A system for recognizing isolated utterances belonging to a very large vocabulary is presented that follows a two-pass strategy. The first step, hypothesization, consists in the selection of a subset of word candidates, starting from the segmentation of speech into six broad phonetic classes. This module is implemented through a dynamic programming algorithm working in a three-dimensional space. The search is performed on a tree representing a coarse description of the lexicon. The second step is the search for the best N candidates according to a maximum-likelihood criterion. Each word candidate is represented by a graph of subword hidden Markov models, and a tree structure of the whole word subset is built on line for an efficient implementation of the Viterbi algorithm. A comparison with a direct approach that does not use the hypothesization module shows that the two-pass approach has the same performance with an 80% reduction in computational complexity.<<ETX>>","PeriodicalId":448544,"journal":{"name":"ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing","volume":"45 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1988-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICASSP.1988.196549","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 13
Abstract
A system for recognizing isolated utterances belonging to a very large vocabulary is presented that follows a two-pass strategy. The first step, hypothesization, consists of selecting a subset of word candidates, starting from the segmentation of speech into six broad phonetic classes. This module is implemented through a dynamic programming algorithm working in a three-dimensional space. The search is performed on a tree representing a coarse description of the lexicon. The second step is the search for the best N candidates according to a maximum-likelihood criterion. Each word candidate is represented by a graph of subword hidden Markov models, and a tree structure of the whole word subset is built on-line for an efficient implementation of the Viterbi algorithm. A comparison with a direct approach that does not use the hypothesization module shows that the two-pass approach achieves the same performance with an 80% reduction in computational complexity.
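To make the pipeline concrete, the following is a minimal, hypothetical Python sketch of the two-pass idea, not the authors' implementation: pass one prunes the vocabulary by dynamic-programming alignment of the utterance's broad-class segmentation against coarse word transcriptions, and pass two would rescore the survivors by Viterbi over left-to-right chains of subword HMM states. The toy lexicon, the class labels, and the flat (non-tree) candidate search are assumptions made for brevity; the paper organizes both passes around lexical trees and a three-dimensional DP search.

```python
# Minimal, hypothetical sketch of a two-pass isolated-word recognizer in the
# spirit of the paper; the toy lexicon, class names, and flat candidate search
# are illustrative only (the real system uses a lexical tree, a 3-D dynamic
# programming hypothesizer, and subword HMMs trained on speech).
import math

# Pass 1: hypothesization. Each word is described coarsely as a sequence of
# broad phonetic classes (the paper uses six). The vocabulary is pruned by
# dynamic-programming alignment of the observed broad-class segmentation
# against each coarse transcription (plain edit distance here for brevity).
def dp_distance(observed, transcription):
    """Levenshtein alignment between two broad-class sequences."""
    n, m = len(observed), len(transcription)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if observed[i - 1] == transcription[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[n][m]

def hypothesize(observed, lexicon, n_best):
    """Return the n_best words whose coarse description matches best."""
    return sorted(lexicon, key=lambda w: dp_distance(observed, lexicon[w]))[:n_best]

# Pass 2: rescoring. Each surviving candidate is expanded into a left-to-right
# chain of subword HMM states and scored by Viterbi; the best candidate under
# this maximum-likelihood score is the recognized word.
def viterbi_logprob(frames, states):
    """Viterbi log-score of a left-to-right HMM. `states` is a list of
    (emit, loop) pairs: emit(frame) -> emission log-likelihood, and loop is
    the self-loop log-probability (residual mass goes to the next state)."""
    NEG = float("-inf")
    v = [NEG] * len(states)
    v[0] = states[0][0](frames[0])
    for frame in frames[1:]:
        new_v = [NEG] * len(states)
        for s, (emit, loop) in enumerate(states):
            stay = v[s] + loop
            enter = v[s - 1] + math.log1p(-math.exp(states[s - 1][1])) if s else NEG
            best = max(stay, enter)
            if best > NEG:
                new_v[s] = best + emit(frame)
        v = new_v
    return v[-1]

if __name__ == "__main__":
    # Toy coarse lexicon over hypothetical broad-class labels.
    lexicon = {
        "casa": ["plosive", "vowel", "fricative", "vowel"],
        "cassa": ["plosive", "vowel", "fricative", "fricative", "vowel"],
        "nave": ["nasal", "vowel", "fricative", "vowel"],
    }
    observed = ["plosive", "vowel", "fricative", "vowel"]
    print("pass-1 candidates:", hypothesize(observed, lexicon, n_best=2))
```

The design point the sketch tries to capture is the one the abstract measures: the cheap coarse-class alignment of pass one discards most of the vocabulary, so the expensive Viterbi scoring of pass two runs only on a small candidate subset, which is where the reported 80% reduction in computation comes from.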