Speech enhancement by iterating forward pass through U-net

2020 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA). Published: 2020-09-23. DOI: 10.23919/spa50552.2020.9241307

Tomasz Grzywalski, S. Drgas

Citations: 1

Abstract

In recent years, speech enhancement has made great progress, driven mostly by bigger and more sophisticated neural networks. In this work, we investigate whether a state-of-the-art speech enhancement neural network can be modified so that it processes the noisy signal multiple times, with the expectation that the enhancement improves with each iteration. Experiments on the WSJ0, Noisex-92 and DCASE datasets show that a U-net with gated dilated convolutions achieves better SI-SDR, STOI and PESQ after processing the noisy signal twice, and that the improvement is consistent across all SNRs and tested noise types. This is achieved without additional trainable parameters or additional memory requirements compared to the baseline model.
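The idea of passing a signal through the same enhancer more than once, and scoring the result with SI-SDR, can be illustrated with a minimal sketch. The abstract does not specify how the iterations are wired together, so this example assumes the simplest scheme: the output of one pass becomes the input of the next, with the same function (and therefore the same parameters) reused each time. The names `iterate_enhancer` and `toy_enhance` are hypothetical, and the moving-average smoother is only a stand-in for the gated dilated U-net, which is not reproduced here. `si_sdr` follows the standard scale-invariant SDR definition.

```python
import numpy as np


def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant signal-to-distortion ratio in dB for 1-D signals."""
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Project the estimate onto the reference to get the scaled target.
    scale = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = scale * reference
    residual = estimate - target
    return 10.0 * np.log10(
        (np.dot(target, target) + eps) / (np.dot(residual, residual) + eps)
    )


def iterate_enhancer(enhance, noisy, n_iter=2):
    """Apply the same enhancement function n_iter times (assumed scheme).

    No new parameters are introduced: every pass reuses `enhance` as is.
    """
    estimate = noisy
    for _ in range(n_iter):
        estimate = enhance(estimate)
    return estimate


def toy_enhance(x, width=5):
    """Hypothetical stand-in for the network: a moving-average smoother."""
    kernel = np.ones(width) / width
    return np.convolve(x, kernel, mode="same")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(1000) / 1000.0
    clean = np.sin(2 * np.pi * 5 * t)
    noisy = clean + 0.3 * rng.standard_normal(t.size)
    for n in (0, 1, 2):
        out = iterate_enhancer(toy_enhance, noisy, n_iter=n)
        print(f"passes={n}  SI-SDR={si_sdr(out, clean):.2f} dB")
```

Because each pass overwrites the previous estimate, the loop holds only one signal at a time, which mirrors the claim that iterating adds no memory beyond a single forward pass. Whether a second pass actually helps depends on the enhancer; the toy smoother says nothing about the results reported for the U-net.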
