Less is more: Faster and better music version identification with embedding distillation

International Society for Music Information Retrieval Conference. Pub Date: 2020-10-07. DOI: 10.5281/ZENODO.4245570

Furkan Yesiler, J. Serrà, E. Gómez

Citations: 7

Abstract

Version identification systems aim to detect different renditions of the same underlying musical composition (loosely called cover songs). By learning to encode entire recordings into plain vector embeddings, recent systems have made significant progress in bridging the gap between accuracy and scalability, which has been a key challenge for nearly two decades. In this work, we propose to further narrow this gap by employing a set of data distillation techniques that reduce the embedding dimensionality of a pre-trained state-of-the-art model. We compare a wide range of techniques and propose new ones, from classical dimensionality reduction to more sophisticated distillation schemes. With those, we obtain 99% smaller embeddings that, moreover, yield up to a 3% accuracy increase. Such small embeddings can have an important impact on retrieval time, up to the point of making a real-world system practical on a standalone laptop.
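To make the retrieval-time claim concrete, the following is a minimal sketch of how embedding size drives storage and per-query cost in a brute-force nearest-neighbour search. It is not the authors' method: the catalog size, the dimensionalities, the 4-byte value size, and the function names `catalog_cost` and `rank_by_distance` are all assumptions chosen for illustration, with the two dimensionalities picked only so that the smaller one is 99% smaller.

```python
import math


def catalog_cost(num_tracks, dim, bytes_per_value=4):
    """Return (storage in bytes, multiply-adds per brute-force query)."""
    storage_bytes = num_tracks * dim * bytes_per_value
    multiply_adds = num_tracks * dim
    return storage_bytes, multiply_adds


def rank_by_distance(query, catalog):
    """Return catalog indices sorted by Euclidean distance to the query."""
    distances = [math.dist(query, emb) for emb in catalog]
    return sorted(range(len(catalog)), key=distances.__getitem__)


# Hypothetical sizes (assumptions, not figures from the paper).
NUM_TRACKS = 1_000_000
FULL_DIM, SMALL_DIM = 10_000, 100

full_bytes, full_ops = catalog_cost(NUM_TRACKS, FULL_DIM)
small_bytes, small_ops = catalog_cost(NUM_TRACKS, SMALL_DIM)
reduction = 1 - small_bytes / full_bytes

print(f"full:  {full_bytes / 1e9:.1f} GB, {full_ops:,} multiply-adds per query")
print(f"small: {small_bytes / 1e9:.1f} GB, {small_ops:,} multiply-adds per query")
print(f"size reduction: {reduction:.0%}")

# Toy 2-D embeddings, standing in for a catalog of reduced embeddings.
toy_catalog = [[0.0, 0.0], [1.0, 1.0], [0.1, 0.0]]
ranking = rank_by_distance([0.0, 0.1], toy_catalog)
print("ranking:", ranking)
```

Under these assumed numbers the full catalog needs 40 GB while the reduced one needs 0.4 GB, and each query costs 100 times fewer multiply-adds, which is the kind of difference that lets an exhaustive search fit in a laptop's memory.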
