Learning Unsupervised Hierarchies of Audio Concepts

International Society for Music Information Retrieval Conference, Pub Date: 2022-07-21, DOI: 10.48550/arXiv.2207.11231

Darius Afchar, Romain Hennequin, Vincent Guigue


Citations: 1

Abstract

Music signals are difficult to interpret from their low-level features, perhaps even more so than images: highlighting part of a spectrogram or an image is often insufficient to convey high-level ideas that are genuinely relevant to humans. In computer vision, concept learning was proposed to adjust explanations to the right abstraction level (e.g. detecting clinical concepts from radiographs). These methods have yet to be used for music information retrieval (MIR). In this paper, we adapt concept learning to the realm of music, with its particularities. For instance, music concepts are typically non-independent and of mixed nature (e.g. genre, instruments, mood), whereas previous work assumed disentangled concepts. We propose a method to learn numerous music concepts from audio and then automatically hierarchise them to expose their mutual relationships. We conduct experiments on datasets of playlists from a music streaming service, which serve as a few annotated examples for diverse concepts. Evaluations show that the mined hierarchies are aligned both with ground-truth hierarchies of concepts, when available, and with proxy sources of concept similarity in the general case.
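The abstract does not specify how concepts are hierarchised, so the following is only a toy illustration of the general idea of deriving a hierarchy from overlapping, non-independent concepts, not the authors' method. It assumes each concept has already been reduced to the set of tracks on which it is detected, and uses a simple subsumption rule: concept B is a parent of concept A when most of A's tracks also carry B. The function name `build_hierarchy`, the `threshold` parameter, and the toy data are all hypothetical.

```python
def build_hierarchy(concept_tracks, threshold=0.8):
    """Map each concept to its most specific parent, or None for roots.

    concept_tracks: dict of concept name -> set of track ids on which the
    concept is considered active (toy, hypothetical input).
    B is a candidate parent of A when B is strictly broader than A and
    the fraction of A's tracks that also carry B is >= threshold.
    """
    parents = {}
    for child, child_set in concept_tracks.items():
        best = None
        for cand, cand_set in concept_tracks.items():
            if cand == child or not child_set:
                continue
            # Only strictly broader concepts can be parents.
            if len(cand_set) <= len(child_set):
                continue
            overlap = len(child_set & cand_set) / len(child_set)
            if overlap >= threshold:
                # Prefer the smallest qualifying parent (most specific).
                if best is None or len(cand_set) < len(concept_tracks[best]):
                    best = cand
        parents[child] = best
    return parents


# Toy data: track ids per concept (invented for illustration only).
toy = {
    "rock": {1, 2, 3, 4, 5, 6},
    "metal": {1, 2, 3},
    "death metal": {1, 2},
    "jazz": {7, 8, 9},
}
hierarchy = build_hierarchy(toy)
print(hierarchy)
```

On this toy input, "death metal" is attached to "metal" rather than directly to "rock" because the rule prefers the narrowest concept that still covers it. Real concept detections would be noisy and of mixed nature (genre, instruments, mood), which is why a threshold below 1.0 is used here.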
