Neural network alternatives to convolutive audio models for source separation

2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP). Pub Date: 2017-09-01. DOI: 10.1109/MLSP.2017.8168108

Shrikant Venkataramani, Cem Subakan, P. Smaragdis

Citations: 14

Abstract

The convolutive non-negative matrix factorization (NMF) model factorizes a given audio spectrogram using frequency templates that have a temporal dimension. In this paper, we present a convolutional auto-encoder model that acts as a neural network alternative to convolutive NMF. Using the modeling flexibility granted by neural networks, we also explore the use of a recurrent neural network in the encoder. Experimental results on speech mixtures from the TIMIT dataset indicate that the convolutive architecture provides a significant improvement in separation performance in terms of BSS Eval metrics.
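
For readers unfamiliar with the baseline, the convolutive NMF reconstruction mentioned in the abstract can be sketched as below. This is a minimal pure-Python illustration of the commonly used formulation, in which the spectrogram is approximated by a sum over time lags of a template slice multiplied by a correspondingly shifted activation matrix. It is not the authors' code; the function names `shift_right` and `conv_nmf_reconstruct` and the toy data are assumptions made for illustration, and the sketch covers only the reconstruction step, not the factor updates or the auto-encoder proposed in the paper.

```python
def shift_right(H, t):
    """Shift each row of H (a list of lists) t columns to the right, zero-filling."""
    n = len(H[0])
    return [[0.0] * min(t, n) + row[: max(n - t, 0)] for row in H]


def conv_nmf_reconstruct(W, H):
    """Approximate a spectrogram as the sum over lags t of W[t] @ shift_right(H, t).

    W: list of T matrices, each F x K (one slice of the templates per lag t).
    H: K x N non-negative activation matrix.
    Returns an F x N matrix (list of lists).
    """
    F, K, N = len(W[0]), len(H), len(H[0])
    V = [[0.0] * N for _ in range(F)]
    for t, W_t in enumerate(W):
        H_t = shift_right(H, t)  # activations delayed by t frames
        for f in range(F):
            for k in range(K):
                w = W_t[f][k]
                for n in range(N):
                    V[f][n] += w * H_t[k][n]
    return V


# Toy example (hypothetical data): one template (K=1) spanning T=2 frames
# over F=2 frequency bins.
W = [
    [[1.0], [0.0]],  # lag 0: energy in bin 0
    [[0.0], [2.0]],  # lag 1: energy in bin 1, one frame later
]
H = [[1.0, 0.0, 3.0, 0.0]]  # the template is triggered at frames 0 and 2
V_hat = conv_nmf_reconstruct(W, H)
print(V_hat)
```

In the toy example, each activation in `H` places the whole two-frame template into the output, which is what "frequency templates with a temporal dimension" means in practice: a single activation explains a short time-frequency patch rather than a single spectral frame.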


