On a robust ASR based on robust complex speech analysis

2015 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS). Pub Date: 2015-11-01. DOI: 10.1109/ISPACS.2015.7432751

Keita Higa, K. Funaki

Citations: 0

Abstract

The European Telecommunications Standards Institute (ETSI) has standardized the advanced front-end (AFE) for automatic speech recognition (ASR). In the AFE, speech enhancement is performed by an iterative Wiener filter (IWF), which is designed from an FFT spectrum smoothed over adjacent frames. We have previously proposed robust time-varying complex AR (TV-CAR) speech analysis and evaluated its performance in speech processing tasks such as F0 estimation and speech enhancement. Because it operates on the analytic signal, TV-CAR analysis can estimate a more accurate spectrum than the FFT, especially at low frequencies. TV-CAR analysis can also estimate the speech spectrum more accurately in the presence of additive noise. In this paper, a time-invariant version of wide-band TV-CAR analysis based on a robust extended least squares (ELS) approach is introduced into the IWF of the AFE and evaluated using the CENSREC-2 database.
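The idea of designing a Wiener filter from a complex AR spectrum of the analytic signal, rather than from a smoothed FFT spectrum, can be illustrated with a minimal sketch. This is not the authors' method: it assumes a time-invariant complex AR model fitted by ordinary least squares (not the robust ELS approach), a single non-iterative Wiener gain, and a noise spectrum that is already known. All function names (`analytic_signal`, `complex_ar_ls`, `ar_power_spectrum`, `wiener_gain`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT: keep DC (and Nyquist), double the
    positive frequencies, zero the negative ones."""
    n = len(x)
    spec = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spec * h)

def complex_ar_ls(z, order):
    """Fit z[t] + sum_i a_i z[t-i] = e[t] by ordinary least squares.
    Assumption: plain LS stands in for the robust ELS estimator."""
    # Each row holds the past samples z[t-1], ..., z[t-order].
    rows = np.array([z[t - order:t][::-1] for t in range(order, len(z))])
    target = z[order:]
    a, *_ = np.linalg.lstsq(rows, -target, rcond=None)
    resid = target + rows @ a
    return a, np.mean(np.abs(resid) ** 2)

def ar_power_spectrum(a, sigma2, n_freq):
    """AR power spectrum sigma2 / |A(e^{jw})|^2 for w in [0, pi)."""
    w = np.linspace(0.0, np.pi, n_freq, endpoint=False)
    k = np.arange(1, len(a) + 1)
    A = 1.0 + np.exp(-1j * np.outer(w, k)) @ a
    return sigma2 / np.abs(A) ** 2

def wiener_gain(speech_psd, noise_psd):
    """Per-frequency Wiener gain S / (S + N)."""
    return speech_psd / (speech_psd + noise_psd)
```

In use, one frame would be passed through `analytic_signal`, modelled with `complex_ar_ls`, and the resulting `ar_power_spectrum` would take the place of the smoothed FFT spectrum as the speech estimate given to `wiener_gain`. The model order and the noise estimate are left to the caller because the abstract does not specify them.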

