Sams-Net: A Sliced Attention-based Neural Network for Music Source Separation

2021 12th International Symposium on Chinese Spoken Language Processing (ISCSLP). Pub Date: 2019-09-12. DOI: 10.1109/ISCSLP49672.2021.9362081

Tingle Li, Jiawei Chen, Haowen Hou, Ming Li

Citations: 16

Abstract

Convolutional neural network (CNN) or long short-term memory (LSTM) models that take spectrograms or waveforms as input are commonly used for deep learning based audio source separation. In this paper, we propose a Sliced Attention-based neural network (Sams-Net) that operates in the spectrogram domain for the music source separation task. It enables spectral feature interactions through a multi-head attention mechanism, is easier to parallelize than LSTMs, and has a larger receptive field than CNNs. Experimental results on the MUSDB18 dataset show that the proposed method, with fewer parameters, outperforms most state-of-the-art DNN-based methods.
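The abstract does not spell out the architecture, so the following is only a minimal NumPy sketch of the general idea under stated assumptions: a magnitude spectrogram of shape (frames, frequency bins) is cut into fixed-length slices along the time axis, and multi-head self-attention is applied independently inside each slice. The slicing axis, the slice length, the random projection weights, and the function names (`multi_head_self_attention`, `sliced_attention`) are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Scaled dot-product self-attention over the rows (frames) of x.

    x has shape (frames, features); each weight matrix is (features, features).
    """
    t, d = x.shape
    d_head = d // num_heads

    def split_heads(m):
        # (frames, features) -> (heads, frames, d_head)
        return m.reshape(t, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split_heads(x @ w_q), split_heads(x @ w_k), split_heads(x @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, frames, frames)
    weights = softmax(scores, axis=-1)
    heads = weights @ v                                   # (heads, frames, d_head)
    merged = heads.transpose(1, 0, 2).reshape(t, d)       # concatenate heads
    return merged @ w_o


def sliced_attention(spec, slice_len, params, num_heads):
    """Apply attention separately to consecutive time slices (assumed scheme)."""
    outputs = []
    for start in range(0, spec.shape[0], slice_len):
        chunk = spec[start:start + slice_len]
        outputs.append(
            multi_head_self_attention(chunk, *params, num_heads=num_heads)
        )
    return np.concatenate(outputs, axis=0)


# Toy data: 64 frames, 32 frequency bins, 4 heads (all values hypothetical).
rng = np.random.default_rng(0)
n_frames, n_bins, n_heads = 64, 32, 4
spec = np.abs(rng.standard_normal((n_frames, n_bins)))
params = tuple(rng.standard_normal((n_bins, n_bins)) * 0.1 for _ in range(4))

out = sliced_attention(spec, slice_len=16, params=params, num_heads=n_heads)
print(out.shape)  # (64, 32)
```

In this sketch every frame in a slice attends to every other frame in that slice through a single matrix product, which illustrates both the parallelism relative to a recurrent LSTM and the wide receptive field relative to a small convolution kernel. Frames in different slices do not interact here; how the actual model combines slices is not stated in the abstract.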
