Authors: B. Kotnik, Z. Kacic, B. Horvat
Published in: The IEEE Region 8 EUROCON 2003. Computer as a Tool. (2003-12-03)
DOI: 10.1109/EURCON.2003.1248166 (https://doi.org/10.1109/EURCON.2003.1248166)
The usage of wavelet packet transformation in automatic noisy speech recognition systems
This paper presents a noise-robust speech feature extraction algorithm based on wavelet packet decomposition (WPD) of the speech signal. In contrast to the time-frequency representation based on the short-time Fourier transform (STFT), a computationally efficient WPD can represent both stationary segments (vowel phonemes) and non-stationary segments (consonants) of the speech signal well. A novel wavelet function is developed and presented for the proposed WPD scheme. Noise robustness is improved by applying a proposed wavelet-based denoising algorithm with a modified soft-thresholding procedure. Principal component analysis (PCA) is used to decorrelate the feature vector elements and to reduce the dimensionality of the final feature vector. Automatic speech recognition results on the Aurora 3 database show a performance improvement over the standardized mel-frequency cepstral coefficient (MFCC) feature extraction algorithm.
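The general shape of such a front end can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the Haar wavelet in place of the paper's novel wavelet function, the standard soft-thresholding rule in place of the modified one, and a fixed, hypothetical threshold value. The PCA stage is omitted. All function names are illustrative.

```python
import math


def haar_step(x):
    """One Haar analysis step: return (approximation, detail), each half as long as x."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet(x, levels):
    """Full wavelet packet tree: every node is split at every level.

    Returns the 2**levels leaf nodes in natural (tree) order, which is
    not the same as increasing-frequency order.
    """
    if len(x) % (2 ** levels) != 0:
        raise ValueError("frame length must be divisible by 2**levels")
    nodes = [list(x)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


def soft_threshold(c, t):
    """Standard soft thresholding: shrink |c| by t, zeroing anything below t."""
    return math.copysign(max(abs(c) - t, 0.0), c)


def subband_log_energies(frame, levels=3, threshold=0.1, floor=1e-10):
    """Denoise each leaf sub-band, then return one log-energy feature per sub-band.

    `threshold` and `floor` are hypothetical values chosen for illustration.
    """
    features = []
    for leaf in wavelet_packet(frame, levels):
        denoised = [soft_threshold(c, threshold) for c in leaf]
        energy = sum(c * c for c in denoised) / len(denoised)
        features.append(math.log(energy + floor))
    return features


# Example: a 64-sample frame of a sine wave yields 8 sub-band features.
frame = [math.sin(2.0 * math.pi * 5.0 * n / 64.0) for n in range(64)]
features = subband_log_energies(frame, levels=3)
```

In a complete system, the per-frame feature vectors produced this way would then be decorrelated and reduced in dimension, for example with PCA as the abstract describes, before being passed to the recognizer.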