{"title":"Feature extraction using discrete wavelet transform for speech recognition","authors":"Z. Tufekci, J. Gowdy","doi":"10.1109/SECON.2000.845444","DOIUrl":null,"url":null,"abstract":"We propose a new feature vector consisting of mel-frequency discrete wavelet coefficients (MFDWC). The MFDWC are obtained by applying the discrete wavelet transform (DWT) to the mel-scaled log filterbank energies of a speech frame. The purpose of using the DWT is to benefit from its localization property in the time and frequency domains. MFDWC are similar to subband-based (SUB) features and multi-resolution (MULT) features in that both attempt to achieve good time and frequency localization. However, MFDWC have better time/frequency localization than SUB features and MULT features. We evaluated the performance of new features for clean speech and noisy speech and compared the performance of MFDWC with mel-frequency cepstral coefficients (MFCC), SUB features and MULT features. Experimental results on a phoneme recognition task showed that a MFDWC-based recognizer gave better results than recognizers based on MFCC, SUB features, and MULT features for white Gaussian noise, band-limited white Gaussian noise and clean speech cases.","PeriodicalId":206022,"journal":{"name":"Proceedings of the IEEE SoutheastCon 2000. 'Preparing for The New Millennium' (Cat. No.00CH37105)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2000-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"103","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the IEEE SoutheastCon 2000. 'Preparing for The New Millennium' (Cat. No.00CH37105)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SECON.2000.845444","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 103
Abstract
We propose a new feature vector consisting of mel-frequency discrete wavelet coefficients (MFDWC). The MFDWC are obtained by applying the discrete wavelet transform (DWT) to the mel-scaled log filterbank energies of a speech frame. The purpose of using the DWT is to benefit from its localization property in the time and frequency domains. MFDWC are similar to subband-based (SUB) features and multi-resolution (MULT) features in that both attempt to achieve good time and frequency localization. However, MFDWC have better time/frequency localization than SUB features and MULT features. We evaluated the performance of new features for clean speech and noisy speech and compared the performance of MFDWC with mel-frequency cepstral coefficients (MFCC), SUB features and MULT features. Experimental results on a phoneme recognition task showed that a MFDWC-based recognizer gave better results than recognizers based on MFCC, SUB features, and MULT features for white Gaussian noise, band-limited white Gaussian noise and clean speech cases.