User-Guided One-Shot Deep Model Adaptation for Music Source Separation

2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). Pub Date: 2021-10-17. DOI: 10.1109/WASPAA52581.2021.9632717

Giorgia Cantisani, A. Ozerov, S. Essid, G. Richard

Citations: 2

Abstract

Music source separation is the task of isolating the individual instruments that are mixed in a musical piece. The task is particularly challenging, and even state-of-the-art models generalize poorly to unseen test data. Nevertheless, prior knowledge about the individual sources can be used to better adapt a generic source separation model to the observed signal. In this work, we propose to exploit a temporal segmentation provided by the user, which indicates when each instrument is active, to fine-tune a pre-trained deep source separation model and adapt it to one specific mixture. This paradigm can be referred to as user-driven one-shot deep model adaptation for music source separation, since the adaptation acts on the target song instance only. Our results are promising and show that state-of-the-art source separation models have large margins for improvement, especially for instruments that are underrepresented in the training data.
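To make the role of the user's temporal segmentation more concrete, the following is a minimal sketch of one way such annotations could be turned into an adaptation signal. It is not taken from the paper: the `(start, end)` annotation format, the function names `activity_mask` and `silence_penalty`, and the choice of penalizing estimated energy in frames marked inactive are all assumptions made for illustration.

```python
import math
from typing import List, Tuple

# Hypothetical annotation format: (start_s, end_s) intervals, in seconds,
# during which the user says a given instrument is active.
Segment = Tuple[float, float]


def activity_mask(segments: List[Segment], n_frames: int, frame_s: float) -> List[int]:
    """Return 1 for every frame overlapping an active segment, else 0."""
    mask = [0] * n_frames
    for start_s, end_s in segments:
        first = max(0, int(start_s / frame_s))
        last = min(n_frames, math.ceil(end_s / frame_s))
        for i in range(first, last):
            mask[i] = 1
    return mask


def silence_penalty(frame_energy: List[float], mask: List[int]) -> float:
    """Mean energy the separated source has in frames marked inactive.

    Assumed objective: a fine-tuning loop would minimize this value with
    respect to the model parameters, for the target mixture only.
    """
    inactive = [e for e, m in zip(frame_energy, mask) if m == 0]
    return sum(inactive) / len(inactive) if inactive else 0.0


# Example: the instrument is annotated as active from 1.0 s to 2.0 s,
# with 0.5 s frames over a 3 s excerpt.
mask = activity_mask([(1.0, 2.0)], n_frames=6, frame_s=0.5)
estimated_energy = [0.0, 0.2, 1.0, 1.0, 0.4, 0.0]  # made-up per-frame energies
penalty = silence_penalty(estimated_energy, mask)
```

In an actual adaptation loop, `frame_energy` would come from the pre-trained model's estimate for one instrument, and the penalty would be one term that the fine-tuning step reduces; how the paper formulates its objective is not stated in the abstract.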
