On The Use of Discrete Cosine Transform Polarity Spectrum in Speech Enhancement

Sisi Shi, Andrew Busch, K. Paliwal, T. Fickenscher
Published in: 2020 28th European Signal Processing Conference (EUSIPCO), pp. 421-425
Publication date: 2021-01-24
DOI: 10.23919/Eusipco47968.2020.9287832 (https://doi.org/10.23919/Eusipco47968.2020.9287832)
Citations: 2

Abstract

This paper investigates the use of the short-time Discrete Cosine Transform (DCT) for speech enhancement. We denote the absolute values and signs of the DCT spectral coefficients as the Absolute Spectrum (AS) and Polarity Spectrum (PoS), respectively. We show theoretically that the noisy PoS is the best estimate of the original PoS under the constrained MMSE criterion. To verify this experimentally, the effect of using the noisy PoS for signal resynthesis is analysed through objective and subjective measures. The results show that when the Instantaneous SNR (ISNR) is above 0 dB, a recovery of the original speech signal that is deemed perfect can be obtained by modifying only the DCT absolute spectrum. In the DFT domain, however, an accurate estimate of the Phase Spectrum (PhS) might be required to achieve the same improvement in perceived speech quality. When perceived quality is measured against the Segmental SNR (SSNR), the PoS preserves speech quality better than the PhS for the same level of global distortion. Overall, the noisy PoS can be used as an estimate of the clean PoS without perceptible degradation in speech quality only if the ISNR of the noisy speech signal is above 0 dB or the SSNR is above 10.5 dB.
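The AS/PoS decomposition described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the frame length, the test signal, the noise level, and the helper names `dct`, `idct` and `split` are all hypothetical, and the ISNR is assumed here to be the per-coefficient ratio |S(k)|^2 / |D(k)|^2, which the abstract does not define. Under that assumption, a coefficient with ISNR above 0 dB satisfies |S(k)| > |D(k)|, so adding the noise cannot flip its sign and the noisy PoS equals the clean PoS at that coefficient.

```python
import math
import random

def dct(x):
    """Orthonormal DCT-II of a real sequence (hypothetical helper)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n)
                               for t in range(n)))
    return out

def idct(c):
    """Inverse of dct(), i.e. the orthonormal DCT-III."""
    n = len(c)
    out = []
    for t in range(n):
        acc = math.sqrt(1.0 / n) * c[0]
        acc += sum(math.sqrt(2.0 / n) * c[k] * math.cos(math.pi * (t + 0.5) * k / n)
                   for k in range(1, n))
        out.append(acc)
    return out

def split(c):
    """Split DCT coefficients into Absolute Spectrum and Polarity Spectrum."""
    absolute = [abs(v) for v in c]
    polarity = [1.0 if v >= 0 else -1.0 for v in c]
    return absolute, polarity

random.seed(0)
N = 32  # assumed frame length, chosen only for illustration
clean = [math.sin(2 * math.pi * 3 * t / N) + 0.5 * math.sin(2 * math.pi * 7 * t / N)
         for t in range(N)]
noise = [0.3 * random.gauss(0.0, 1.0) for _ in range(N)]

S = dct(clean)
D = dct(noise)
# The DCT is linear, so the spectrum of clean + noise is S + D.
Y = [s + d for s, d in zip(S, D)]

clean_as, clean_pos = split(S)
_, noisy_pos = split(Y)

# Coefficients with assumed per-coefficient ISNR above 0 dB: |S(k)| > |D(k)|.
high_isnr = [k for k in range(N) if abs(S[k]) > abs(D[k])]
pos_matches = all(clean_pos[k] == noisy_pos[k] for k in high_isnr)

# Resynthesise from the clean AS combined with the noisy PoS.
resynth = idct([a * p for a, p in zip(clean_as, noisy_pos)])
flipped = [k for k in range(N) if clean_pos[k] != noisy_pos[k]]
error = max(abs(a - b) for a, b in zip(resynth, clean))

print("coefficients with ISNR > 0 dB:", len(high_isnr))
print("noisy PoS equals clean PoS on those coefficients:", pos_matches)
print("coefficients with flipped polarity:", len(flipped))
print("max resynthesis error with clean AS and noisy PoS:", error)
```

Any polarity flips are confined to coefficients where the noise magnitude is at least as large as the speech magnitude, so the resynthesis error from keeping the noisy PoS comes entirely from those low-ISNR coefficients.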