A Hybrid DSP/Deep Learning Approach to Real-Time Full-Band Speech Enhancement

2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP) | Pub Date: 2017-09-24 | DOI: 10.1109/MMSP.2018.8547084

J. Valin

Citations: 139

Abstract

Despite noise suppression being a mature area in signal processing, it remains highly dependent on fine tuning of estimator algorithms and parameters. In this paper, we demonstrate a hybrid DSP/deep learning approach to noise suppression. We focus strongly on keeping the complexity as low as possible, while still achieving high-quality enhanced speech. A deep recurrent neural network with four hidden layers is used to estimate ideal critical band gains, while a more traditional pitch filter attenuates noise between pitch harmonics. The approach achieves significantly higher quality than a traditional minimum mean squared error spectral estimator, while keeping the complexity low enough for real-time operation at 48 kHz on a low-power CPU.
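As a rough illustration of how per-band gains could be applied to a spectrum, the sketch below linearly interpolates a handful of band gains to one gain per frequency bin and scales each bin by it. The abstract does not specify the band layout, the interpolation rule, or any function names, so `band_edges`, `interpolate_band_gains`, and `apply_gains` are hypothetical choices made only for this example; the network that would produce the gains and the pitch filter are not shown.

```python
from bisect import bisect_right


def interpolate_band_gains(band_edges, band_gains, num_bins):
    """Linearly interpolate per-band gains to one gain per frequency bin.

    band_edges: strictly increasing bin indices, one per band
                (hypothetical layout, not taken from the paper).
    band_gains: one gain per band, e.g. values in [0, 1] from a network.
    """
    if len(band_edges) != len(band_gains):
        raise ValueError("need exactly one gain per band edge")
    gains = []
    for k in range(num_bins):
        if k <= band_edges[0]:
            # Below the first band: hold the first gain.
            gains.append(band_gains[0])
        elif k >= band_edges[-1]:
            # At or above the last band: hold the last gain.
            gains.append(band_gains[-1])
        else:
            # Find the band whose edge lies at or just below bin k.
            i = bisect_right(band_edges, k) - 1
            lo, hi = band_edges[i], band_edges[i + 1]
            t = (k - lo) / (hi - lo)
            gains.append((1.0 - t) * band_gains[i] + t * band_gains[i + 1])
    return gains


def apply_gains(spectrum, bin_gains):
    """Scale each complex bin by its real gain; the phase is left unchanged."""
    return [g * x for g, x in zip(bin_gains, spectrum)]


if __name__ == "__main__":
    # Three hypothetical bands over a 10-bin spectrum.
    bin_gains = interpolate_band_gains([0, 4, 8], [1.0, 0.0, 0.5], 10)
    enhanced = apply_gains([1 + 1j] * 10, bin_gains)
    print(bin_gains)
    print(enhanced)
```

Estimating a few band gains instead of one value per bin keeps the estimator's output small, which is in line with the low-complexity goal stated in the abstract.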


