New approaches to speech enhancement using phase correction in Wiener filtering

P. Fardkhaleghi, M. Savoji
Published in: 2010 5th International Symposium on Telecommunications, 2010-12-01
DOI: 10.1109/ISTEL.2010.5734149 (https://doi.org/10.1109/ISTEL.2010.5734149)
Citations: 4

Abstract

Typical speech enhancement algorithms that operate in the Fourier domain modify only the magnitude component of the noisy speech. The phase component is commonly assumed to be perceptually unimportant and is therefore passed directly to the output. Recent experiments, however, have reported that the Short-Time Fourier Transform (STFT) phase spectrum contributes significantly to speech intelligibility. Motivated by this, we investigated the role of the phase spectrum in speech enhancement using Wiener filtering and Martin's minimum statistics. In this paper we report results obtained using optimization algorithms that correct the phase of each processed frame, with the aim of matching the waveform of the zero-phase Wiener-filtered speech to the conventional filter output obtained with the noisy phase characteristic. No a priori information on the original phase is assumed. We show that phase correction yields better results for different noise types. Different optimization criteria are used, with results similar to those obtained when the actual clean speech phase is available. Nearly as good results are obtained by minimizing the dispersion of the Wiener filter impulse response. The improvement is assessed using measures such as signal-to-noise ratio (SNR), segmental SNR, and Perceptual Evaluation of Speech Quality (PESQ).
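As a rough illustration of the conventional baseline the abstract starts from, the sketch below applies a real-valued Wiener gain to one frame, which scales the magnitude and leaves the noisy phase untouched, and then scores the result with SNR and segmental SNR. It is not the authors' phase-correction method. The function names, the power-subtraction estimate of the speech power, the gain floor, the clamping range for segmental SNR, and the use of the true noise power in place of a minimum-statistics estimate are all assumptions made for this toy example.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform (adequate for a short demo frame)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Inverse DFT; keeps the real part, assuming a conjugate-symmetric spectrum."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]


def wiener_gain(noisy_power, noise_power, floor=1e-3):
    """Per-bin Wiener gain P_s / (P_s + P_n), with P_s from power subtraction (assumed)."""
    gains = []
    for p_y, p_n in zip(noisy_power, noise_power):
        p_s = max(p_y - p_n, 0.0)
        g = p_s / (p_s + p_n) if (p_s + p_n) > 0.0 else 0.0
        gains.append(max(g, floor))  # hypothetical floor to avoid zeroing bins
    return gains


def enhance_frame(noisy_frame, noise_power):
    """Magnitude-only enhancement: a real gain scales |Y| and keeps the noisy phase."""
    spec = dft(noisy_frame)
    gains = wiener_gain([abs(c) ** 2 for c in spec], noise_power)
    return idft([g * c for g, c in zip(gains, spec)])


def snr_db(clean, estimate):
    """Global SNR in dB between a clean reference and an estimate."""
    sig = sum(s * s for s in clean)
    err = sum((s - e) ** 2 for s, e in zip(clean, estimate))
    if err == 0.0:
        return math.inf
    if sig == 0.0:
        return -math.inf
    return 10.0 * math.log10(sig / err)


def segmental_snr_db(clean, estimate, frame_len, lo=-10.0, hi=35.0):
    """Mean of per-frame SNRs, each clamped to [lo, hi] dB (a common convention)."""
    if len(clean) < frame_len:
        raise ValueError("signal shorter than one frame")
    vals = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        e = estimate[start:start + frame_len]
        vals.append(min(max(snr_db(c, e), lo), hi))
    return sum(vals) / len(vals)


# Toy data: a "speech" sinusoid plus a narrowband "noise" sinusoid.
N = 64
clean = [math.sin(2 * math.pi * 4 * t / N) for t in range(N)]
noise = [0.3 * math.sin(2 * math.pi * 20 * t / N) for t in range(N)]
noisy = [c + v for c, v in zip(clean, noise)]

# Oracle noise power, standing in for a minimum-statistics estimate.
noise_power = [abs(c) ** 2 for c in dft(noise)]
enhanced = enhance_frame(noisy, noise_power)

print("SNR noisy / enhanced (dB):", snr_db(clean, noisy), snr_db(clean, enhanced))
print("SegSNR noisy / enhanced (dB):",
      segmental_snr_db(clean, noisy, 16), segmental_snr_db(clean, enhanced, 16))
```

Because the gain is real and non-negative, the phase of every bin in the enhanced frame equals the phase of the corresponding noisy bin. That is the behavior the paper sets out to improve with per-frame phase correction. PESQ is not computed here, since it requires a dedicated implementation of the standardized measure.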