Robust speech recognition by properly utilizing reliable frames and segments in corrupted signals

2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU) Pub Date: 2007-12-01 DOI: 10.1109/ASRU.2007.4430091

Yi Chen, C. Wan, Lin-Shan Lee

Citations: 1

Abstract

In this paper, we propose a new approach to detecting and utilizing reliable frames and segments in corrupted signals for robust speech recognition. Novel approaches to estimating an energy-based measure and a harmonicity measure for each frame are developed. SNR-dependent GMM classifiers are then trained, together with a reliable frame selection and clustering module and a reliable segment identification module, to detect the most reliable frames in an utterance. These reliable frames and segments thus obtained can be properly used in both front-end feature enhancement and back-end Viterbi decoding. In the extensive experiments reported here, very significant improvements in recognition accuracies were obtained with the proposed approaches for all types of noise and all SNR values defined in the Aurora 2 database.
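
To make the idea of per-frame reliability measures concrete, here is a minimal Python sketch. It is not the estimators or classifiers proposed in the paper: the energy measure is plain log frame energy, the harmonicity measure is the peak of the normalized autocorrelation in an assumed pitch range, and a simple threshold rule stands in for the SNR-dependent GMM classifiers. All function names, the 25 ms / 10 ms framing, the 60-400 Hz pitch range, and the thresholds are assumptions chosen for illustration; the 8 kHz default matches the sampling rate of Aurora 2.

```python
import numpy as np

def frame_signal(x, frame_len=200, hop=80):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def log_energy(frames, eps=1e-10):
    """Per-frame energy in dB (a generic measure, not the paper's estimator)."""
    return 10.0 * np.log10(np.mean(frames ** 2, axis=1) + eps)

def harmonicity(frames, sr=8000, f_min=60.0, f_max=400.0, eps=1e-10):
    """Peak of the normalized autocorrelation within the assumed pitch-lag range."""
    lag_min = int(sr / f_max)
    lag_max = int(sr / f_min)
    scores = np.empty(len(frames))
    for i, f in enumerate(frames):
        f = f - f.mean()
        # Keep non-negative lags only; index 0 is lag 0.
        ac = np.correlate(f, f, mode="full")[len(f) - 1:]
        scores[i] = np.max(ac[lag_min:lag_max + 1]) / (ac[0] + eps)
    return scores

def reliable_mask(x, sr=8000, harm_thresh=0.5, energy_margin_db=20.0):
    """Hypothetical threshold rule standing in for the GMM classifiers."""
    frames = frame_signal(x, frame_len=sr // 40, hop=sr // 100)  # 25 ms / 10 ms
    e = log_energy(frames)
    h = harmonicity(frames, sr=sr)
    return (e > e.max() - energy_margin_db) & (h > harm_thresh)
```

Under these assumptions, a strongly periodic frame (such as a voiced vowel) scores high on harmonicity, while broadband noise scores low, so the mask tends to keep high-energy voiced frames. A classifier trained per SNR condition, as the abstract describes, would replace the fixed thresholds.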

