Utilizing spectro-temporal correlations for an improved speech presence probability based noise power estimation

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) | Pub Date: 2015-04-19 | DOI: 10.1109/ICASSP.2015.7177992

Martin Krawczyk-Becker, Dörte Fischer, Timo Gerkmann


Citations: 8

Abstract

For the enhancement of speech degraded by noise, accurate estimation of the noise power spectral density (PSD) is indispensable, especially if only a single microphone signal is available. Fast and accurate tracking of the noise PSD is particularly challenging for highly non-stationary noise, since speech and noise components become harder to distinguish. Noise PSD estimation algorithms that operate in the short-time discrete Fourier transform (STFT) domain and employ estimates of the speech presence probability (SPP) with fixed priors have been shown to yield good tracking performance even in adverse noise conditions. In this paper, we compare two methods of incorporating spectro-temporal correlations to improve the tracking performance. The first method smooths the noisy observation over time and frequency before computing the SPP, while the second is based on a hidden Markov model (HMM) of the speech presence and absence states. We show that the proposed modifications lead to improved noise PSD estimators that are less sensitive to spectral outliers of the noise and track changes in the noise PSD more quickly than the reference method. Further, when employed in a common speech enhancement setup, the proposed estimators achieve increased noise reduction while keeping speech distortions at a comparable level.
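The first method in the abstract (smoothing the noisy observation before computing the SPP) can be illustrated with a minimal sketch. The abstract does not give equations, so everything specific here is an assumption: the fixed-prior SPP formula is a commonly used form, and the values `xi_opt_db=15`, `p_h1=0.5`, `alpha=0.8`, the moving-average window `half_width`, the recursive factor `beta`, and the noise-only initialization over `n_init` frames are illustrative choices, not the authors' settings. Feeding a smoothed observation into the unmodified SPP formula is also a simplification, since smoothing changes the statistics of the observation.

```python
import numpy as np

def spp_fixed_prior(observation, noise_psd, xi_opt_db=15.0, p_h1=0.5):
    """Per-bin speech presence probability with fixed priors (assumed form)."""
    xi = 10.0 ** (xi_opt_db / 10.0)                    # fixed a priori SNR
    post_snr = observation / np.maximum(noise_psd, 1e-12)
    ratio = (1.0 - p_h1) / p_h1 * (1.0 + xi) * np.exp(-post_snr * xi / (1.0 + xi))
    return 1.0 / (1.0 + ratio)

def smooth_freq(frame, half_width):
    """Moving average over neighbouring frequency bins (hypothetical window)."""
    kernel = np.ones(2 * half_width + 1) / (2 * half_width + 1)
    padded = np.pad(frame, half_width, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

def track_noise_psd(periodograms, alpha=0.8, beta=0.0, half_width=0, n_init=10):
    """Track the noise PSD from periodograms of shape (frames, bins).

    beta=0 and half_width=0 give the unsmoothed baseline; larger values
    smooth the observation over time and frequency before the SPP step.
    Assumes the first n_init frames contain noise only.
    """
    noise_psd = periodograms[:n_init].mean(axis=0)
    smoothed = smooth_freq(periodograms[0], half_width)
    estimates = np.empty_like(periodograms, dtype=float)
    for l, frame in enumerate(periodograms):
        # Smooth the observation recursively over time and across frequency.
        smoothed = beta * smoothed + (1.0 - beta) * smooth_freq(frame, half_width)
        p = spp_fixed_prior(smoothed, noise_psd)
        # Soft decision: keep the old estimate where speech is likely.
        noise_periodogram = (1.0 - p) * frame + p * noise_psd
        noise_psd = alpha * noise_psd + (1.0 - alpha) * noise_periodogram
        estimates[l] = noise_psd
    return estimates
```

Calling `track_noise_psd(P)` on an array `P` of squared STFT magnitudes gives the unsmoothed baseline, while `track_noise_psd(P, beta=0.5, half_width=1)` applies the smoothing first. Because an isolated noise outlier is averaged with its neighbours before the SPP is computed, it is less likely to be mistaken for speech, which is the intuition behind the reduced sensitivity to spectral outliers described above.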


