Martin Krawczyk-Becker, Dörte Fischer, Timo Gerkmann
{"title":"Utilizing spectro-temporal correlations for an improved speech presence probability based noise power estimation","authors":"Martin Krawczyk-Becker, Dörte Fischer, Timo Gerkmann","doi":"10.1109/ICASSP.2015.7177992","DOIUrl":null,"url":null,"abstract":"For the enhancement of speech degraded by noise, accurate estimation of the noise power spectral density (PSD) is indispensable, especially if only a single microphone signal is available. Fast and accurate tracking of the noise PSD is particularly challenging in highly non-stationary noise types, since the distinction between speech and noise components becomes more difficult. Short-time discrete Fourier transform (STFT) based noise PSD estimation algorithms which employ estimates of the speech presence probability (SPP) with fixed priors have been shown to yield good tracking performance even in adverse noise conditions. In this paper, we compare two methods to incorporate spectro-temporal correlations to improve the tracking performance. The first method smoothes the noisy observation over time and frequency before computing the SPP, while the second is based on a Hidden Markov Model (HMM) of the speech presence and absence states. We show that the proposed modifications lead to improved noise PSD estimators which are less sensitive to spectral outliers of the noise and track changes in the noise PSD more quickly than the reference method. Further, when employed in a common speech enhancement setup, the proposed estimators achieve an increased noise reduction while keeping speech distortions at a comparable level.","PeriodicalId":117666,"journal":{"name":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICASSP.2015.7177992","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8
Abstract
For the enhancement of speech degraded by noise, accurate estimation of the noise power spectral density (PSD) is indispensable, especially if only a single microphone signal is available. Fast and accurate tracking of the noise PSD is particularly challenging in highly non-stationary noise types, since the distinction between speech and noise components becomes more difficult. Short-time discrete Fourier transform (STFT) based noise PSD estimation algorithms which employ estimates of the speech presence probability (SPP) with fixed priors have been shown to yield good tracking performance even in adverse noise conditions. In this paper, we compare two methods to incorporate spectro-temporal correlations to improve the tracking performance. The first method smoothes the noisy observation over time and frequency before computing the SPP, while the second is based on a Hidden Markov Model (HMM) of the speech presence and absence states. We show that the proposed modifications lead to improved noise PSD estimators which are less sensitive to spectral outliers of the noise and track changes in the noise PSD more quickly than the reference method. Further, when employed in a common speech enhancement setup, the proposed estimators achieve an increased noise reduction while keeping speech distortions at a comparable level.