Using Phase Space based processing to extract proper features for ASR systems

2010 5th International Symposium on Telecommunications. Pub Date: 2010-12-01. DOI: 10.1109/ISTEL.2010.5734094

Y. Shekofteh, F. Almasganj

Citations: 10

Abstract

This paper presents a feature extraction technique based on Reconstructed Phase Spaces (RPS) that improves the overall performance of typical speech recognition systems. Unlike conventional feature extraction methods, which use an FFT-based algorithm for power spectrum estimation (PSE) of the speech signal, the proposed method is based on the trajectory and flow matrix of the signal's RPS. In this way, a new representation of the power spectrum is obtained using a two-dimensional DFT, from which modified versions of common feature extraction methods such as MFCC can be derived. We conducted speech recognition experiments using HTK, the well-known HMM-based toolkit, on FARSDAT, a well-known Persian speech corpus. With this modified feature extraction method, we obtained a 1.35% improvement in word error rate compared to the baseline system, which uses typical MFCC feature extraction.
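The abstract does not give the embedding dimension, the time lag, or the exact construction of the trajectory and flow matrix, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a two-dimensional time-delay embedding, and it uses a normalized 2-D occupancy histogram of the embedded points as a hypothetical stand-in for the trajectory matrix before applying a two-dimensional DFT. The function names and the values `lag=6` and `bins=16` are illustrative assumptions.

```python
import numpy as np


def reconstruct_phase_space(x, dim=2, lag=6):
    """Time-delay embedding: row n is [x[n], x[n+lag], ..., x[n+(dim-1)*lag]]."""
    x = np.asarray(x, dtype=float)
    n_points = len(x) - (dim - 1) * lag
    if n_points <= 0:
        raise ValueError("frame is too short for this dim and lag")
    return np.stack([x[i * lag: i * lag + n_points] for i in range(dim)], axis=1)


def occupancy_matrix(points, bins=16):
    """Assumed stand-in for the trajectory matrix: a normalized 2-D histogram
    counting how often the trajectory visits each cell of the phase plane."""
    peak = np.max(np.abs(points))
    if peak == 0:
        peak = 1.0  # avoid a zero-width histogram range for silent frames
    hist, _, _ = np.histogram2d(
        points[:, 0], points[:, 1],
        bins=bins, range=[[-peak, peak], [-peak, peak]],
    )
    return hist / hist.sum()


def rps_power_spectrum(frame, lag=6, bins=16):
    """Squared magnitude of the 2-D DFT of the occupancy matrix."""
    points = reconstruct_phase_space(frame, dim=2, lag=lag)
    occ = occupancy_matrix(points, bins=bins)
    return np.abs(np.fft.fft2(occ)) ** 2


# Usage on a synthetic 25 ms frame (200 Hz sine sampled at 16 kHz).
frame = np.sin(2 * np.pi * 200 * np.arange(400) / 16000)
spec = rps_power_spectrum(frame)
print(spec.shape)
```

How the paper maps this two-dimensional spectrum into an MFCC-style pipeline (filterbank, logarithm, DCT) is not stated in the abstract, so that stage is left out here.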
