Using Phase Space based processing to extract proper features for ASR systems
Y. Shekofteh, F. Almasganj
2010 5th International Symposium on Telecommunications, 2010-12-01
DOI: 10.1109/ISTEL.2010.5734094 (https://doi.org/10.1109/ISTEL.2010.5734094)
Citations: 10
Abstract
This paper presents a feature extraction technique based on Reconstructed Phase Spaces (RPS) that improves the overall performance of typical speech recognition systems. Unlike conventional feature extraction methods, which use an FFT-based algorithm for power spectrum estimation (PSE) of the speech signal, the proposed method is based on the trajectory and flow matrix of the signal's RPS. A new representation of the power spectrum is obtained by applying a two-dimensional DFT, from which modified versions of common feature extraction methods such as MFCC can be derived. We conducted speech recognition experiments using HTK, the well-known HMM-based toolkit, on FARSDAT, a well-known Persian speech corpus. With this modified feature extraction method, we obtained a 1.35% improvement in word error rate over the baseline system, which uses the typical MFCC feature extraction method.
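The abstract does not give the construction details, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the standard time-delay embedding for the RPS trajectory matrix and takes the squared magnitude of its two-dimensional DFT as a spectrum-like representation. The function names, the embedding dimension `dim`, the lag `lag`, and the synthetic test frame are all hypothetical choices made for illustration. The flow matrix is not sketched because the abstract does not define it.

```python
import numpy as np


def trajectory_matrix(x, dim=3, lag=2):
    """Time-delay embedding of a 1-D frame (assumed RPS construction).

    Row n is the point [x[n], x[n + lag], ..., x[n + (dim - 1) * lag]].
    """
    x = np.asarray(x, dtype=float)
    n_rows = len(x) - (dim - 1) * lag
    if n_rows <= 0:
        raise ValueError("frame too short for this dim and lag")
    # Each column is the frame shifted by a multiple of the lag.
    return np.stack([x[i * lag : i * lag + n_rows] for i in range(dim)], axis=1)


def rps_power_spectrum(x, dim=3, lag=2):
    """Squared magnitude of the 2-D DFT of the trajectory matrix (illustrative)."""
    traj = trajectory_matrix(x, dim, lag)
    return np.abs(np.fft.fft2(traj)) ** 2


# Hypothetical input: one 25 ms frame of a 200 Hz tone sampled at 16 kHz.
fs = 16000
t = np.arange(400) / fs
frame = np.sin(2 * np.pi * 200 * t)

traj = trajectory_matrix(frame, dim=3, lag=2)
spec = rps_power_spectrum(frame, dim=3, lag=2)
```

In an MFCC-style pipeline, a representation like `spec` would stand in for the FFT-based power spectrum before mel filtering, log compression, and the DCT. How the two-dimensional result is reduced to a form suitable for the filter bank is not stated in the abstract and is left open here.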