Musical Similarity and Commonness Estimation Based on Probabilistic Generative Models

2015 IEEE International Symposium on Multimedia (ISM). Pub Date: 2015-12-01. DOI: 10.1142/S1793351X1640002X

Tomoyasu Nakano, Kazuyoshi Yoshii, Masataka Goto

Citations: 8

Abstract

This paper proposes a novel concept we call musical commonness: the similarity of a song to a set of songs, in other words its typicality. Commonness can be used to retrieve representative songs from a song set (e.g., songs released in the 80s or 90s). Previous research on musical similarity has compared pairs of songs but has not evaluated the similarity of a song to a set of songs. The methods presented here for estimating the similarity and commonness of polyphonic musical audio signals are based on a unified framework of probabilistic generative modeling of four musical elements (vocal timbre, musical timbre, rhythm, and chord progression). To estimate commonness, we use a generative model trained on a song set, instead of estimating the musical similarities of all possible song pairs with a model trained on each song. In the experimental evaluation, we used 3278 popular music songs. Estimated song-pair similarities are comparable to ratings by a musician at the 0.1% significance level for vocal and musical timbre, at the 1% level for rhythm, and at the 5% level for chord progression. The results of the commonness evaluation show that the higher the musical commonness, the more similar a song is to the songs of a song set.
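The contrast the abstract draws between a model trained on a song set and a model trained on each song can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the generative models used, so a single diagonal Gaussian over frame-level feature vectors stands in for them here, the function names are hypothetical, and the "songs" are synthetic random data rather than audio features.

```python
import numpy as np

def fit_diag_gaussian(frames):
    """Fit a diagonal Gaussian to an (n_frames, n_dims) feature matrix.

    Assumption: a stand-in for whatever generative model is actually used.
    """
    mean = frames.mean(axis=0)
    var = frames.var(axis=0) + 1e-6  # variance floor keeps the log-density finite
    return mean, var

def avg_log_likelihood(frames, model):
    """Mean per-frame log-density of `frames` under a diagonal Gaussian."""
    mean, var = model
    log_density = -0.5 * (np.log(2.0 * np.pi * var) + (frames - mean) ** 2 / var)
    return float(log_density.sum(axis=1).mean())

def commonness(song_frames, song_set):
    """Score one song against a single model trained on the whole song set."""
    set_model = fit_diag_gaussian(np.vstack(song_set))
    return avg_log_likelihood(song_frames, set_model)

def similarity(song_a, song_b):
    """Score song_a against a model trained on song_b alone."""
    return avg_log_likelihood(song_a, fit_diag_gaussian(song_b))

# Synthetic data (hypothetical): 20 songs of 200 four-dimensional frames each.
rng = np.random.default_rng(0)
song_set = [rng.normal(0.0, 1.0, size=(200, 4)) for _ in range(20)]
typical_song = rng.normal(0.0, 1.0, size=(200, 4))   # drawn like the set
atypical_song = rng.normal(5.0, 1.0, size=(200, 4))  # shifted away from the set

typical_score = commonness(typical_song, song_set)
atypical_score = commonness(atypical_song, song_set)
print(typical_score > atypical_score)
```

In this sketch, scoring a song for commonness needs one model fitted to the pooled set, whereas the pairwise route would fit a model per song and evaluate every song pair.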
